High-resolution melting analysis

ABSTRACT

The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids and the identification of nucleic acids. Methods and systems are disclosed for identifying a nucleic acid in a sample including an unknown nucleic acid and for detecting a single nucleotide polymorphism in a nucleic acid in a sample. Methods and systems are disclosed for identification of a nucleic acid in a biological sample including at least one unknown nucleic acid by fitting denaturation data including measurements of a quantifiable physical change of the sample at a plurality of independent sample property points to a function to determine an intrinsic physical value and to obtain an estimated physical change function, and identifying the nucleic acid in the biological sample by comparing the intrinsic physical value for at least one unknown nucleic acid to an intrinsic physical value for a known nucleic acid.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 14/101,647, filed Dec. 10, 2013, which is a continuation of U.S. application Ser. No. 13/429,911, filed Mar. 26, 2012, now U.S. Pat. No. 8,606,529, which is a continuation of U.S. application Ser. No. 12/258,098, filed Oct. 24, 2008, now U.S. Pat. No. 8,145,433, which claims the benefit of U.S. Provisional Patent Application Ser. No. 60/982,570, filed on Oct. 25, 2007, each of which is incorporated herein by reference in its entirety.

BACKGROUND

1. Field of the Invention

The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids and the identification of nucleic acids. More specifically, embodiments of the present invention relate to methods and systems for the analysis of denaturation data of nucleic acids.

2. Description of Related Art

The detection of nucleic acids is central to medicine, forensic science, industrial processing, crop and animal breeding, and many other fields. The ability to detect disease conditions (e.g., cancer), infectious organisms (e.g., HIV), genetic lineage, genetic markers, and the like, is a ubiquitous technology for disease diagnosis and prognosis, marker assisted selection, correct identification of crime scene features, the ability to propagate industrial organisms and many other techniques. Determination of the integrity of a nucleic acid of interest can be relevant to the pathology of an infection or cancer. One of the most powerful and basic technologies to detect small quantities of nucleic acids is to replicate some or all of a nucleic acid sequence many times, and then analyze the amplification products. PCR is perhaps the most well-known of a number of different amplification techniques.

PCR is a powerful technique for amplifying short sections of DNA. With PCR, one can quickly produce millions of copies of DNA starting from a single template DNA molecule. PCR includes a three phase temperature cycle of denaturation of DNA into single strands, annealing of primers to the denatured strands, and extension of the primers by a thermostable DNA polymerase enzyme. This cycle is repeated so that there are enough copies to be detected and analyzed. In principle, each cycle of PCR could double the number of copies. In practice, the multiplication achieved after each cycle is always less than 2. Furthermore, as PCR cycling continues, the buildup of amplified DNA products eventually ceases as the concentrations of required reactants diminish. For general details concerning PCR, see Sambrook and Russell, Molecular Cloning: A Laboratory Manual (3rd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (2000); Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2005) and PCR Protocols: A Guide to Methods and Applications, M. A. Innis et al., eds., Academic Press Inc. San Diego, Calif. (1990).

Real-time PCR refers to a growing set of techniques in which one measures the buildup of amplified DNA products as the reaction progresses, typically once per PCR cycle. Monitoring the accumulation of products over time allows one to determine the efficiency of the reaction, as well as to estimate the initial concentration of DNA template molecules. For general details concerning real-time PCR see Real-Time PCR: An Essential Guide, K. Edwards et al., eds., Horizon Bioscience, Norwich, U.K. (2004).

More recently, a number of high throughput approaches to performing PCR and other amplification reactions have been developed, e.g., involving amplification reactions in microfluidic devices, as well as methods for detecting and analyzing amplified nucleic acids in or on the devices. Thermal cycling of the sample for amplification in microfluidic devices is usually accomplished in one of two methods. In the first method, the sample solution is loaded into the device and the temperature is cycled in time, much like a conventional PCR instrument. In the second method, the sample solution is pumped continuously through spatially varying temperature zones. See, e.g., Lagally et al. (Analytical Chemistry 73:565-570 (2001)), Kopp et al. (Science 280:1046-1048 (1998)), Park et al. (Analytical Chemistry 75:6029-6033 (2003)), Hahn et al. (WO 2005/075683), Enzelberger et al. (U.S. Pat. No. 6,960,437) and Knapp et al. (U.S. Patent Application Publication No. 2005/0042639).

Once there are a sufficient number of copies of the original DNA molecule, the DNA can be characterized. One method of characterizing the DNA is to examine the DNA's dissociation behavior as the DNA transitions from double stranded DNA (dsDNA) to single stranded DNA (ssDNA) with increasing temperature. The process of causing DNA to transition from dsDNA to ssDNA is sometimes referred to as a “high-resolution temperature (thermal) melt (HRTm)” process, or simply a “high-resolution melt” process.

Melt curve analysis is an important technique for analyzing nucleic acids. In some methods, a double stranded nucleic acid is denatured in the presence of a dye that indicates whether the two strands are bound or not. Examples of such indicator dyes include non-specific binding dyes such as SYBR® Green I, whose fluorescence efficiency depends strongly on whether it is bound to double stranded DNA. As the temperature of the mixture is raised, a reduction in fluorescence from the dye indicates that the nucleic acid molecule has melted, i.e., unzipped, partially or completely. Thus, by measuring the dye fluorescence as a function of temperature, information is gained regarding the length of the duplex, the GC content or even the exact sequence. See, e.g., Ririe et al. (Anal Biochem 245:154-160, 1997), Wittwer et al. (Clin Chem 49:853-860, 2003), Liew et al. (Clin Chem 50:1156-1164, 2004), Herrmann et al. (Clin Chem 52:494-503, 2006), Knapp et al. (U.S. Patent Application Publication No. 2002/0197630), Wittwer et al. (U.S. Patent Application Publication No. 2005/0233335), Wittwer et al. (U.S. Patent Application Publication No. 2006/0019253), Sundberg et al. (U.S. Patent Application Publication No. 2007/0026421) and Knight et al. (U.S. Patent Application Publication No. 2007/0231799).

Some nucleic acid assays require identification of a single nucleotide change where the difference in melting temperature (T_(m)) between the wild type nucleic acid and the mutant nucleic acid is less than, for example, 0.25° C. This level of temperature resolution is difficult, if not impossible, in standard 96 and 384 well plates. Decreasing the area of thermal analysis can improve the spatial temperature gradient, but there is still significant noise generated from the heating device used to linearly ramp the samples during a thermal melt. Accordingly, what are desired are methods and systems for high resolution melt analysis that are capable of more accurately discriminating thermal melt curves and obtaining DNA sequence information from these melting curves, especially where these thermal melt curves are differentiated by a small temperature range. Also desired are methods and systems for high resolution melt analysis that more accurately identify thermal melt curves that facilitate detection of sequence information for DNA that contain one or more peaks or mutations.

SUMMARY OF THE INVENTION

The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids and the identification of nucleic acids. More specifically, embodiments of the present invention relate to methods and systems for the analysis of denaturation data of nucleic acids.

In one aspect, the present invention provides a method for identifying a nucleic acid in a sample including at least one unknown nucleic acid. According to this aspect, the method comprises fitting denaturation data, which includes measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P(x, Q) to determine an intrinsic physical value Q and to obtain an estimated physical change function, wherein the intrinsic physical value Q is an intrinsic physical value associated with the nucleic acid in the sample, and the quantifiable physical change P is associated with denaturation of a nucleic acid. The method further comprises identifying the nucleic acid in the biological sample by comparing the intrinsic physical value Q for at least one unknown nucleic acid to the intrinsic physical value Q for a known nucleic acid.

In one embodiment, the comparison of the intrinsic physical value Q for at least one unknown nucleic acid is made to an a priori distribution of the intrinsic physical value Q for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In another embodiment, Q is one or more fitting parameters Y, Z and W and the denaturation data is fit to one or more of these fitting parameters to determine one or more intrinsic physical values Y, Z and W. In an additional embodiment, one or more of the intrinsic physical values Y, Z and W are determined and compared. In a further embodiment, two or more of the intrinsic physical values Y, Z and W are determined and compared. In yet another embodiment, all of the intrinsic physical values Y, Z and W are determined and compared.
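By way of non-limiting illustration, the comparison of a fitted value to an a priori distribution can be sketched as a simple tolerance test. The sketch assumes the distribution for the known nucleic acid is summarized by a mean and a standard deviation; the function name `matches_known`, the cutoff `max_z`, and the numeric values in the usage note are hypothetical and are not taken from this specification.

```python
def matches_known(q_unknown, known_mean, known_sd, max_z=3.0):
    """Return True if a fitted value Q lies within max_z standard deviations
    of the a priori mean for a known nucleic acid.

    Assumption: the a priori distribution is adequately described by its
    mean and standard deviation; max_z is an illustrative cutoff.
    """
    return abs(q_unknown - known_mean) <= max_z * known_sd
```

For example, with a hypothetical known melting temperature distribution of mean 75.0 and standard deviation 0.05, a fitted value of 75.1 would be accepted and a fitted value of 75.3 would be rejected. The same test could be applied separately to each of Y, Z and W.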

In one embodiment, the sample further includes a double-strand specific fluorescent dye. In another embodiment, the quantifiable physical change is fluorescence intensity. In a further embodiment, quantifiable physical change P is ultraviolet absorbance. In one embodiment, x is the sample temperature T, Y is the melting temperature T_(m), Z is the van't Hoff enthalpy ΔH and W is the amplitude A. In some embodiments, Y and Z are determined and compared in the method of the invention.

In one embodiment, the fitting step includes fitting the denaturation data to a function P(x, Q) using a computer-implemented non-linear least squares algorithm. In another embodiment, the non-linear least squares algorithm determines the value of at least one fitting parameter, and the intrinsic physical value Q is a fitting parameter whose value is determined by the non-linear least squares algorithm.
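By way of non-limiting illustration, a least-squares fit of a single melting temperature can be sketched as follows. The sketch uses the single-component form of Eq. 1 to Eq. 3 given later in this summary, read with base-e exponentials, and holds ΔH, A, B and k at known values. A coarse-to-fine search over T_(m) stands in for a full non-linear least squares solver, which would normally fit several parameters at once. Temperatures in kelvin, a negative ΔH sign convention, and the names `model`, `sum_sq` and `fit_tm` are assumptions of this sketch.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def model(T, Tm, dH, A, B, k):
    """Single-component melt signal (Eq. 1 to Eq. 3 with n = 1); T, Tm in kelvin."""
    x = 1.0 / (4.0 * math.exp((dH / R) * (1.0 / Tm - 1.0 / T)))
    g = (1.0 + x) - math.sqrt((1.0 + x) ** 2 - 1.0)
    return B + 2.0 * A * g * math.exp(-k * (T - Tm))


def sum_sq(temps, signal, Tm, dH, A, B, k):
    """Sum of squared residuals between the data and the model."""
    return sum((p - model(t, Tm, dH, A, B, k)) ** 2 for t, p in zip(temps, signal))


def fit_tm(temps, signal, dH, A, B, k, lo, hi, rounds=5, points=21):
    """Estimate Tm by repeatedly shrinking a search grid around the best point.

    Assumption: only Tm is unknown and the true value lies in [lo, hi].
    """
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda tm: sum_sq(temps, signal, tm, dH, A, B, k))
        lo, hi = best - step, best + step
    return best
```

With the default settings, each round narrows the search interval tenfold, so five rounds starting from a 20 K interval resolve T_(m) to roughly 0.0001 K on noise-free data.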

In a further embodiment, the fitting step includes fitting the denaturation data to a function P₁(x, Q) to obtain a first estimated physical change function in which the function P₁(x, Q) describes the relationship of the quantifiable physical change of a sample containing one nucleic acid to the independent sample property x. In another embodiment, the method further comprises the steps of (i) fitting the denaturation data to a function P₂(x, Q₁, Q₂) to obtain a second estimated physical change function, in which the function P₂(x, Q₁, Q₂) describes the relationship of the quantifiable physical change of a sample containing 2 distinct nucleic acids to the independent sample property x of a sample containing 2 distinct nucleic acids, and (ii) quantifying the number of distinct nucleic acids in the sample by comparing the first and second estimated physical change functions with the denaturation data to determine if 1 or 2 distinct unknown nucleic acids are present in the sample. In another embodiment, the determining step includes determining a melting temperature T_(m(1)) for a first unknown nucleic acid and determining a melting temperature T_(m(2)) for a second unknown nucleic acid.

In another embodiment, the method further comprises the step of calculating the binding fraction g_(i) of each nucleotide in the nucleic acid from the fit parameters. In a further embodiment, the method further comprises the steps of calculating the analytical derivative of the estimated physical change function dP/dx and displaying a plot of said derivative based on the fit parameters Q to the function P. In yet another embodiment, the independent sample property x is the sample temperature T and the intrinsic physical value Y is the melting temperature T_(m). In one embodiment, the fitting step is performed by fitting the denaturation data to an equation of the form

$\begin{matrix}{{{P\left( {T,T_{m}} \right)} = {B + {\sum\limits_{i = 1}^{n}P_{i}}}},\; {wherein}} & \left( {{Eq}.\mspace{14mu} 1} \right) \\{{P_{i} = {2A_{i}g_{i}^{\lbrack{- {k{({T - T_{m_{i}}})}}}\rbrack}}},} & \left( {{Eq}.\mspace{14mu} 2} \right) \\{{g_{i} = {1 + \frac{1}{4^{\lbrack{\frac{\Delta \; H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}} - \sqrt{\left( {1 + \frac{1}{4^{\lbrack{\frac{\Delta \; H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}}} \right)^{2} - 1}}},} & \left( {{Eq}.\mspace{14mu} 3} \right)\end{matrix}$

and wherein

B is the baseline measurement of the quantifiable physical property in absence of the sample, A_(i) is the amplitude of the measurement of the quantifiable physical property of the ith nucleic acid present in the sample, ΔH_(i) is the van't Hoff enthalpy of denaturation of the ith nucleic acid present in the sample, T_(m(i)) is the melting temperature of the ith nucleic acid present in the sample, and R is the universal gas constant.
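By way of non-limiting illustration, Eq. 1 to Eq. 3 can be evaluated as sketched below. The sketch reads the exponential terms as base-e exponentials, which makes g_(i) equal to 1/2 at T = T_(m(i)). It further assumes temperatures in kelvin and R in J/(mol·K). With this form of Eq. 3, g_(i) falls from near 1 to near 0 with rising temperature only when ΔH_(i) is entered as a negative number, so the sketch adopts that sign convention. The constant k is not defined in the passage above and is treated here as a generic decay constant. The function names are hypothetical.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def binding_fraction(T, Tm, dH):
    """Eq. 3: binding fraction g_i for one nucleic acid (T and Tm in kelvin)."""
    x = 1.0 / (4.0 * math.exp((dH / R) * (1.0 / Tm - 1.0 / T)))
    a = 1.0 + x
    return a - math.sqrt(a * a - 1.0)


def melt_signal(T, B, components, k):
    """Eq. 1 and Eq. 2: baseline B plus one term per (A_i, Tm_i, dH_i) tuple."""
    total = B
    for A, Tm, dH in components:
        total += 2.0 * A * binding_fraction(T, Tm, dH) * math.exp(-k * (T - Tm))
    return total
```

At T = T_(m(i)) the term P_(i) reduces to A_(i), so for a single component the modeled signal at the melting temperature is B + A. A two-component sample is represented by passing two tuples in `components`.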

In another embodiment, the method further comprises the step of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data. In another embodiment, the unknown nucleic acid is a nucleic acid containing a single nucleotide polymorphism.

In another aspect, the present invention provides a system for identifying a nucleic acid in a sample including at least one unknown nucleic acid. According to this aspect, the system comprises a fitting module capable of fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P(x, Q) to determine an intrinsic physical value Q and to obtain an estimated physical change function, wherein the intrinsic physical value Q is an intrinsic physical value associated with the nucleic acid in the sample, and the quantifiable physical change P is associated with denaturation of a nucleic acid. The system further comprises an identification module capable of identifying the nucleic acid in the biological sample by comparing the intrinsic physical value Q for the unknown nucleic acid to the intrinsic physical value Q for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical.

In one embodiment, Q is one or more fitting parameters Y, Z and W and the denaturation data is fit to one or more of these fitting parameters to determine one or more intrinsic physical values Y, Z and W. In another embodiment, one or more of the intrinsic physical values Y, Z and W are determined and compared. In an additional embodiment, two or more of the intrinsic physical values Y, Z and W are determined and compared. In a further embodiment, all of the intrinsic physical values Y, Z and W are determined and compared. In one embodiment, x is the sample temperature T, Y is the melting temperature T_(m), Z is the van't Hoff enthalpy ΔH and W is the amplitude A. In some embodiments, Y and Z are determined and compared in the method according to the invention.

In one embodiment, the fitting module is a computer containing instructions for performing a non-linear least squares algorithm. In another embodiment, the independent sample property x is the sample temperature T. In a further embodiment, the intrinsic physical value Y is the melting temperature T_(m). In one embodiment, the computer containing instructions for performing a non-linear least squares algorithm further contains instructions for fitting the denaturation data to a function of the form

$\begin{matrix}{{{P\left( {T,T_{m}} \right)} = {B + {\sum\limits_{i = 1}^{n}P_{i}}}},\; {wherein}} & \left( {{Eq}.\mspace{14mu} 1} \right) \\{{P_{i} = {2A_{i}g_{i}^{\lbrack{- {k{({T - T_{m_{i}}})}}}\rbrack}}},} & \left( {{Eq}.\mspace{14mu} 2} \right) \\{{g_{i} = {1 + \frac{1}{4^{\lbrack{\frac{\Delta \; H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}} - \sqrt{\left( {1 + \frac{1}{4^{\lbrack{\frac{\Delta \; H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}}} \right)^{2} - 1}}},} & \left( {{Eq}.\mspace{14mu} 3} \right)\end{matrix}$

and wherein

B is the baseline measurement of the quantifiable physical property in absence of the sample, A_(i) is the amplitude of the measurement of the quantifiable physical property of the ith nucleic acid present in the sample, ΔH_(i) is the van't Hoff enthalpy of denaturation of the ith nucleic acid present in the sample, T_(m(i)) is the melting temperature of the ith nucleic acid present in the sample, and R is the universal gas constant.

In one embodiment, the fitting module includes a computer containing instructions for performing a non-linear least squares algorithm to determine the value of at least one fitting parameter, in which the melting temperature T_(m) is a fitting parameter whose value is capable of being determined by the non-linear least squares algorithm. In another embodiment, the system further comprises a generating unit capable of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data.

In another aspect, the present invention provides a method for quantifying the number of distinct nucleic acids in a sample which includes at least one nucleic acid. In accordance with this aspect, the method comprises fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P_(n)(x) to obtain a first estimated physical change function, wherein said function P_(n)(x) describes the relationship of the quantifiable physical change of a sample containing n distinct nucleic acids to an independent sample property x of a sample containing n distinct nucleic acids, and the quantifiable physical change P is associated with the denaturation of a nucleic acid. The method further comprises fitting the denaturation data to a function P_(n+1)(x) to obtain a second estimated physical change function, wherein said function P_(n+1)(x) describes the relationship of the quantifiable physical change of a sample containing n+1 distinct nucleic acids to an independent sample property x of a sample containing n+1 distinct nucleic acids. The method also comprises quantifying the number of distinct nucleic acids in the sample by comparing the first and second estimated physical change functions with the denaturation data to determine if n or n+1 different nucleic acids are present in the sample.
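By way of non-limiting illustration, the comparison of the n and n+1 fits can be sketched with an extra-sum-of-squares F statistic computed from the residuals of each fit. The figure descriptions refer to an F-ratio between a two peak fit and a one peak fit, but this section does not state its formula, so the standard nested-model form below is an assumption, as are the function names and the way parameters are counted.

```python
def residual_sum_sq(observed, fitted):
    """Sum of squared differences between the data and an estimated function."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


def f_ratio(rss_n, params_n, rss_n1, params_n1, num_points):
    """Extra-sum-of-squares F statistic for nested fits.

    rss_n, rss_n1:       residual sums of squares of the n and n+1 fits
    params_n, params_n1: number of fitted parameters in each model
    num_points:          number of denaturation data points
    Assumption: the n model is nested in the n+1 model.
    """
    extra_params = params_n1 - params_n
    dof = num_points - params_n1
    return ((rss_n - rss_n1) / extra_params) / (rss_n1 / dof)
```

A large value suggests that the additional nucleic acid term explains substantially more of the data than its extra parameters would by chance, favoring n+1. A value near 1 favors n. Any decision threshold would need to be set for the particular assay and experimental conditions.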

In one embodiment, the first fitting step includes determining the intrinsic physical value Q₁ for at least one of the nucleic acids present in the sample, and wherein the second fitting step includes determining an intrinsic physical value Q₂ for at least one of the nucleic acids in the sample. In another embodiment, the first fitting step includes fitting the denaturation data to a function P_(n)(x) using a computer-implemented fitting algorithm, and the second fitting step includes fitting the denaturation data to a function P_(n+1)(x) using a computer-implemented fitting algorithm. In a further embodiment, the independent sample property is the sample temperature T. In one embodiment, the computer-implemented fitting algorithms are non-linear least squares algorithms. In another embodiment, the non-linear least squares algorithm determines the value of at least one fitting parameter, in which the melting temperature T_(m) of at least one nucleic acid is a fitting parameter whose value is determined by the non-linear least squares algorithm. In a further embodiment, the method further comprises the step of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data.

In another aspect, the present invention provides a system for quantifying the number of distinct nucleic acids in a sample which includes at least one nucleic acid. In accordance with this aspect, the system comprises a fitting module capable of fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P_(n)(x) to obtain an n nucleic acid estimated physical change function, wherein said function P_(n)(x) describes the relationship of the quantifiable physical change of a sample containing n distinct nucleic acids to the independent sample property of a sample containing n distinct nucleic acids, and the quantifiable physical change is associated with the denaturation of a nucleic acid. The system further comprises a quantification module capable of quantifying the number of distinct nucleic acids in the sample by comparing an n nucleic acid physical change function and an n+1 nucleic acid physical change function with the denaturation data to determine if n or n+1 different nucleic acids are present in the sample.

In one embodiment, the independent sample property x is the sample temperature T. In another embodiment, the fitting module is capable of estimating the melting temperature T_(m) for at least one of the nucleic acids present in the sample, in which the fitting module performs an estimation of the melting temperature T_(m) for at least one of the nucleic acids in the sample. In a further embodiment, the fitting module is a computer containing instructions for fitting the denaturation data to a function P_(n)(T) via a non-linear least squares algorithm. In one embodiment, the non-linear least squares algorithm is capable of determining the value of at least one fitting parameter, in which the melting temperature T_(m) of at least one nucleic acid is a fitting parameter whose value is capable of being determined by the non-linear least squares algorithm. In another embodiment, the system further comprises a generating unit capable of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data.

BRIEF DESCRIPTION OF THE FIGURES

The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments of the present invention.

FIG. 1 illustrates a Savitzky-Golay derivative curve plot for a wild type and single nucleotide mutant nucleic acid.

FIG. 2 illustrates the low pass filtering effect of the Savitzky-Golay filter.

FIG. 3 illustrates the amplitude frequency response of the Savitzky-Golay filter and the attenuation of the −dF/dT curve.

FIG. 4 illustrates a method of determining the presence of multiple nucleic acids in accordance with one aspect of the present invention.

FIG. 5 shows a visual representation of an iterative steepest descent algorithm with two unknown parameters, P₁ and P₂, in accordance with one aspect of the present invention.

FIG. 6 illustrates a flow chart showing the adaptive procedure for determining the complexity of the model equations (number of peaks or nucleotides) in accordance with one aspect of the present invention.

FIGS. 7A and 7B illustrate a method for describing a melt curve using a single peak in accordance with one aspect of the present invention, using data obtained from wild type Salmonella DNA. In this example, a single peak is adequate to describe this data as the F-ratio between the two peak model fit (7B) and one peak model fit (7A) is 2.45, which is relatively low.

FIGS. 8A and 8B illustrate a method for describing a melt curve using two peaks in accordance with one embodiment of the present invention. In this example, two peaks (or nucleotides) are necessary to describe this data as the F-ratio between the two peak model fit (8B) and one peak model fit (8A) is 1500, which is relatively high.

FIG. 9 illustrates a flow chart showing a method in accordance with one aspect of the present invention.

FIG. 10 illustrates a microfluidic device in accordance with some aspects of the present invention.

FIG. 11 illustrates the distribution of F-ratios for a group of melt curves of known genotypes for a particular assay and set of experimental conditions.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention has several embodiments and relies on patents, patent applications and other references for details known to those of skill in the art. Therefore, when a patent, patent application, or other reference is cited or repeated herein, it should be understood that it is incorporated by reference in its entirety for all purposes as well as for the proposition that is recited.

The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, N.Y., Gait, Oligonucleotide Synthesis: A Practical Approach, 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry, 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.

Thermal melt curves of fluorescence have been used to determine the melting temperature of a DNA strand when denatured from the duplex state to the two separate single strands via a ramp increase in temperature. Typically, the melting temperature or T_(m) is defined to be the temperature at which 50% of the paired DNA strands have denatured into single strands. Intercalating dyes that fluoresce when bound to double stranded DNA and lose their fluorescence when denatured are often used in measuring T_(m). Typically, the negative derivative of fluorescence with respect to temperature (−dF/dT) has been used in the determination of T_(m). In typical systems, the temperature at the peak −dF/dT is used as an estimate of the melting temperature T_(m).
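By way of non-limiting illustration, the conventional peak estimate can be sketched with a plain finite difference in place of a derivative filter. The function name and the choice to report the midpoint of the steepest interval are assumptions of this sketch. On real data the raw difference is noisy, which is why a smoothing derivative filter is typically used instead.

```python
def tm_from_derivative_peak(temps, fluor):
    """Return the midpoint temperature of the interval where -dF/dT is largest.

    temps: temperatures in increasing order
    fluor: fluorescence measured at each temperature
    Uses a simple forward difference between consecutive points.
    """
    best_t = None
    best_slope = float("-inf")
    for i in range(len(temps) - 1):
        slope = -(fluor[i + 1] - fluor[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_t = 0.5 * (temps[i] + temps[i + 1])
    return best_t
```

The resolution of this estimate is limited to half the temperature spacing, and it reports only one peak, so two closely spaced melting temperatures would be merged into a single value.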

The −dF/dT derivative curve is typically obtained using a Savitzky-Golay (SG) derivative filter which is capable of estimating the derivative of any signal. Savitzky-Golay filters are low pass, Finite Impulse Response (FIR) derivative filters, and their application to any dynamical signal is obtained through the convolution of the FIR filter parameters with the raw signal. When the spacing of the independent variable is uniform, the filtered results can give first order and higher order derivatives of the dependent variable relative to the independent variable. The effect of such a filter is equivalent to a moving polynomial fit, followed by the evaluation of the derivative of that polynomial at the center of the window. To use the SG filter, the temperature difference between consecutive points has to be exactly equal (perfect ramp in temperature); otherwise there are potential problems, such as the lowering and broadening of peaks, due to the low pass filtering effect of the SG filter. Furthermore, the degree of filtering depends on the polynomial order and window size (or number of points). In the frequency domain, sharper peaks are further attenuated than broader ones. Perhaps the greatest shortcoming of the SG derivative filter is its inability to resolve and detect multiple melting temperatures for heterozygous mutant DNA when there are two or more T_(m) temperatures that are in close proximity. In one aspect of the present invention, methods and systems are described that do not suffer from the shortcomings of using SG derivative filters and that have the ability to detect one or more melting temperature(s) from DNA thermal melt data as well as other thermodynamic parameters for each melting temperature.
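By way of non-limiting illustration, the convolution described above can be sketched for one common case: a 5-point window with a quadratic polynomial, for which the first-derivative coefficients are (−2, −1, 0, 1, 2)/10 divided by the point spacing. The window size and polynomial order are chosen here only for illustration. The function name is hypothetical, and points within two samples of either end are simply omitted.

```python
# Savitzky-Golay first-derivative weights for a 5-point window and a
# quadratic fit; the weighted sum is divided by 10 * h, h being the spacing.
SG5_FIRST_DERIV = (-2.0, -1.0, 0.0, 1.0, 2.0)


def sg_first_derivative(y, h):
    """Estimate dy/dx at interior points of uniformly spaced samples y.

    Assumption: the spacing h between consecutive x values is constant.
    Returns len(y) - 4 values, one per full window position.
    """
    half = len(SG5_FIRST_DERIV) // 2
    out = []
    for i in range(half, len(y) - half):
        acc = sum(c * y[i + j - half] for j, c in enumerate(SG5_FIRST_DERIV))
        out.append(acc / (10.0 * h))
    return out
```

Because the weights come from a quadratic fit, the result is exact for quadratic data. For example, samples of x² at x = 0 to 6 give derivatives 4, 6 and 8 at x = 2, 3 and 4. Features narrower than the window are averaged over five points, which is the lowering and broadening of peaks noted above.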

In addition, when the melting temperatures of the wild type nucleic acid and the mutant nucleic acid are close enough, an overlap will be formed between two derivative melting curves, e.g., the derivative melting curves that are obtained for the wild type allele and the mutant allele in a heterozygous sample. Such an overlap in the derivative melting curves can affect the measurement of the melting temperature. In other aspects of the present invention, methods and systems are described that resolve the melting curves to precisely measure the melting temperatures of the alleles present in the sample.

The present invention relates to methods and systems for the analysis of the dissociation behavior of nucleic acids and the identification of nucleic acids. More specifically, the present invention relates to methods and systems for the analysis of denaturation data of nucleic acids and the identification of nucleic acids. For example, melting curve analysis can be used to detect single nucleotide polymorphisms (SNPs). Molecular melt curves (and differences between molecular melt curves) can also be used to detect and analyze sequence differences between nucleic acids. The thermal denaturation curve for nucleic acids can be monitored by, for example, measuring thermal parameters, fluorescence of indicator dyes/molecules, fluorescence polarization, dielectric properties, or the like.

Melting curve analysis is typically carried out either in a stopped flow format or in a continuous flow format. In one example of a stopped flow format, flow is stopped within a microchannel of a microfluidic device while the temperature in that channel is ramped through a range of temperatures required to generate the desired melt curve. In an alternative stopped flow format, melting curve analysis is done in a chamber to which the nucleic acid sample has been added. In one example of a continuous flow format, a melting curve analysis is performed by applying a temperature gradient along the length (direction of flow) of a microchannel of a microfluidic device. If the melting curve analysis requires that the molecules being analyzed be subjected to a range of temperatures extending from a first temperature to a second temperature, the temperature at one end of the microchannel is controlled to the first temperature, and the temperature at the other end of the length is controlled to the second temperature, thus creating a continuous temperature gradient spanning the temperature range between the first and second selected temperatures. An example of an instrument for performing a melting curve analysis is disclosed in U.S. Patent Application Publication No. 2007/0231799, incorporated herein by reference in its entirety.

The denaturation data that is analyzed in accordance with aspects of the present invention is obtained by techniques well known in the art. See, e.g., Knight et al. (U.S. Patent Application Publication No. 2007/0231799); Knapp et al. (U.S. Patent Application Publication No. 2002/0197630); Wittwer et al. (U.S. Patent Application Publication No. 2007/0020672); and Wittwer et al. (U.S. Pat. No. 6,174,670). Although the present invention is applicable to the analysis of denaturation data obtained in any environment, it is particularly useful for denaturation data obtained in the microfluidic environment because of the need for greater sensitivity in this environment.

In accordance with certain aspects of the invention, thermal melt data is generated by elevating the temperature of a molecule or molecules, e.g., of one or more nucleic acids, for a selected period of time and measuring a detectable property emanating from the molecule or molecules, wherein the detectable property indicates an extent of denaturation of the nucleic acid. This period of time can range, for example, from about 0.01 second through to about 1.0 minute or more, from about 0.01 second to about 10 seconds or more, or from about 0.1 second to about 1.0 second or more, including all time periods in between. In one embodiment, heating comprises elevating the temperature of the molecule or molecules by continuously increasing the temperature of the molecule or molecules. For example, the temperature of the molecule(s) can be continuously increased at a rate in the range of about 0.1° C./second to about 1° C./second. Alternatively, the temperature of the molecule(s) can be continuously increased at a slower rate, such as a rate in the range of about 0.01° C./second to about 0.1° C./second, or at a faster rate, such as a rate in the range of about 1° C./second to about 10° C./second. The heating can occur through application of an internal or an external heat source, as is known in the art.

A change in a physical property of the molecules can be detected by numerous methods, depending on the specific molecules and reactions involved. For example, the denaturation of the molecules can be tracked by following fluorescence or emitted light from molecules in the assay. The degree of, or change in, fluorescence is correlational or proportional to the degree of change in conformation of the molecules being assayed. Thus, in some methods, the detection of a property of the molecule(s) comprises detecting a level of fluorescence or emitted light from the molecule(s) that varies as a function of relative amounts of binding. In one configuration, the detecting of fluorescence involves a first molecule and a second molecule, wherein the first molecule is a fluorescence indicator dye or a fluorescence indicator molecule and the second molecule is the target molecule to be assayed. In one embodiment, the fluorescence indicator dye or fluorescence indicator molecule binds or associates with the second molecule by binding to hydrophobic or hydrophilic residues on the second molecule. The methods of detecting optionally further comprise exciting the fluorescence indicator dye or fluorescence indicator molecule to create an excited fluorescence indicator dye or excited fluorescence indicator molecule and discerning and measuring an emission or quenching event of the excited fluorescence indicator dye or fluorescence indicator molecule.

In aspects of the present invention, the thermal melt data can be used to generate a thermal property curve. In some methods, the generation of a thermal property curve includes providing one molecule comprising a fluorescence indicator dye or fluorescence indicator molecule, and at least a second molecule comprising one or more of an enzyme, a ligand, a peptide nucleic acid, a cofactor, a receptor, a substrate, a protein, a polypeptide, a nucleic acid (either double-stranded or single-stranded), an antibody, an antigen, or an enzyme complex. Fluorescence of the first molecule in the presence of the second molecule as a function of temperature is measured and the resulting data is used to generate a thermal property curve. In other methods, the generation of a thermal property curve comprises measuring a change in the fluorescence of one molecule that is correlative or proportional to a change in a physical property of another molecule(s) due to a change in temperature. In still other methods, the generation of a thermal property curve comprises measuring the change in the total free energy of the system as a function of temperature without the presence of a second molecule. Typically, the methods also include generating a thermal property curve of a control or known sample in a similar manner.

Several techniques exist for the measurement of the denaturation of the molecules of interest, and any of these can be used in generating the data to be analyzed in accordance with aspects of the present invention. Such techniques include fluorescence, fluorescence polarization, fluorescence resonance energy transfer, circular dichroism and UV absorbance. Briefly, the fluorescence techniques involve the use of spectroscopy to measure changes in fluorescence or light to track the denaturation/unfolding of the target molecule as the target molecule is subjected to changes in temperature. Spectrometry, e.g. via fluorescence, is a useful method of detecting thermally induced denaturation/unfolding of molecules. Many different methods involving fluorescence are available for detecting denaturation of molecules (e.g. intrinsic fluorescence, numerous fluorescence indicator dyes or molecules, fluorescence polarization, fluorescence resonance energy transfer, etc.) and are optional embodiments of the present invention. These methods can take advantage of either internal fluorescent properties of target molecules or external fluorescence, i.e. the fluorescence of additional indicator molecules involved in the analysis.

A method of measuring the degree of denaturation/unfolding of the target molecule is through monitoring of the fluorescence of dyes or molecules added to the microfluidic device along with the target molecule and any test molecules of interest. A fluorescence dye or molecule refers to any fluorescent molecule or compound (e.g., a fluorophore) which can bind to a target molecule either once the target molecule is unfolded or denatured or before the target molecule undergoes conformational change by, for example, denaturing and which emits fluorescent energy or light after it is excited by, for example, light of a specified wavelength.

One dye type used in the microfluidic devices is one that intercalates within strands of nucleic acids. The classic example of such a dye is ethidium bromide. An exemplary use of ethidium bromide for binding assays includes, for example, monitoring for a decrease in fluorescence emission from ethidium bromide due to binding of test molecules to nucleic acid target molecules (ethidium bromide displacement assay). See, e.g., Lee, M. et al. (J Med Chem 36(7):863-870 (1993)). The use of nucleic acid intercalating agents in measurement of denaturation is well known to those in the art. See, e.g., Haugland (Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Inc., Eugene, Oreg. (1996)).

Dyes that bind to nucleic acids by mechanisms other than intercalation can also be employed in embodiments of the invention. For example, dyes that bind the minor groove of double stranded DNA can be used to monitor the molecular unfolding/denaturation of the target molecule due to temperature. Examples of suitable minor groove binding dyes are the SYBR Green family of dyes sold by Molecular Probes Inc. (Eugene, Oreg., USA). See, e.g., Haugland (Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Inc., Eugene, Oreg., USA (1996)). SYBR Green dyes will bind to any double stranded DNA molecule. When a SYBR Green dye binds to double stranded DNA, the intensity of the fluorescent emissions increases. As more double stranded DNA is denatured due to increasing temperature, the SYBR Green dye signal will decrease. Another suitable dye is LCGreen Plus sold by Idaho Technology, Inc. (Salt Lake City, Utah, USA).

Fluorescence polarization (FP) provides a useful method to detect hybridization formation between molecules of interest. This method is especially applicable to hybridization detection between nucleic acids, for example, to monitor single nucleotide polymorphisms (SNPs). Generally, FP operates by monitoring the speed of rotation of fluorescent labels, such as fluorescent dyes or molecular beacons, e.g. before, during, and/or after binding events between molecules that comprise the test and target molecules. In short, binding of a test molecule to the target molecule ordinarily results in a decrease in the speed of rotation of a bound label on one of the molecules, resulting in a change in FP.

Fluorescence resonance energy transfer (FRET) can be used to track the conformational changes of the target molecule (and interactions with test molecules which can bind with the target molecule) as a function of temperature. FRET relies on a distance-dependent transfer of energy from a donor fluorophore to an acceptor fluorophore. If an acceptor fluorophore is in close proximity to an excited donor fluorophore, then the emission of the donor fluorophore can be transferred to the acceptor fluorophore. This causes a concomitant reduction in the emission intensity of the donor fluorophore and an increase in the emission intensity of the acceptor fluorophore. Since the efficiency of the excitation transfer depends, inter alia, on the distance between the two fluorophores, the technique can be used to measure extremely small distances such as would occur when detecting changes in conformation. This technique is particularly suited for measurement of binding reactions, protein-protein interactions (e.g., a protein of interest binding to an antibody), and other biological events altering the proximity of two labeled molecules. Many appropriate interactive labels are known. For example, fluorescent labels, dyes, enzymatic labels, and antibody labels are all appropriate.

Circular dichroism (CD) can be used to follow the conformational changes of the target molecules/test molecules as a function of temperature and can be used to construct molecular melt curves. CD is a type of light absorption spectroscopy which measures the difference in absorbance by a molecule between right-circularly polarized light and left-circularly polarized light. CD is quite sensitive to the structure of polypeptides and proteins.

UV absorbance can also be used to detect and/or track denaturation of nucleic acid molecules, and/or to quantify the total amount of nucleic acid. UV can be employed to measure the extent of denaturation because the UV absorbance value of single stranded nucleic acid molecules is greater than the absorbance value of double stranded nucleic acid molecules.

Once the denaturation data has been obtained and melt curves generated, if desired, the data and/or melt curves are then analyzed to identify the molecules in the sample, such as, for example, the identification of a nucleic acid in a sample. This analysis is performed in accordance with the aspects of the present invention which are described herein.

In accordance with certain aspects of the invention, the original fluorescence data, that is, the raw fluorescence versus temperature data, is fitted directly to a model and then analyzed, without first applying a Savitzky-Golay (SG) filter to the data. In accordance with certain aspects of the invention, a model includes the temperature dependence of the fluorescence from the fluorophore used to generate the denaturation data. In accordance with certain aspects of the invention, the methods and systems can determine whether there is only one peak or more than one peak. Each of the aspects of the present invention can be implemented by a computer which would collect the data, analyze the data in accordance with the methods described herein, and then provide a result of the analysis, which could include, for example, T_(m)(s) of the nucleic acid(s) in the sample(s) and/or the identity of the nucleic acid(s) or SNP(s) in the sample(s).

FIG. 1 shows a negative derivative (−dF/dT) curve plot for a wild type and single nucleotide mutant nucleic acid associated with sickle cell anemia as obtained from an eight channel microfluidic device. The negative derivative curve plot is obtained from the fluorescence data using a Savitzky-Golay (SG) filter as known in the art. The T_(m) is determined as the peak of the derivative curve and is reported in degrees Celsius. The delta T_(m), or the difference in T_(m)s between the two nucleic acids, is about 0.5° C. As the delta T_(m) becomes smaller, it becomes increasingly difficult to distinguish between two possible nucleic acids using the T_(m) value alone.

In addition, problems arise with the use of SG filters in obtaining negative derivative curve data. FIG. 2 illustrates problems that can arise with the use of SG filters. In particular, FIG. 2 illustrates the simulation of a melting curve based on a mathematical model to determine the shape of a thermal melt curve in conjunction with its analytically derived derivative in the lower subplot. FIG. 2 shows that the low pass filtering effect of the Savitzky-Golay filter is dependent on the window size and polynomial order. The SG filter attenuates the true derivative signal. Also shown are various SG derivative plots based on different polynomial orders and window sizes. As shown, the low pass filtering effect of the SG filters lowers and broadens the peak. Furthermore, the degree of filtering depends on the polynomial order and window size (or number of points). In the frequency domain, it can also be shown that sharper peaks are further attenuated than broader ones, as shown by the amplitude-frequency response curve shown in FIG. 3. In particular, FIG. 3 shows that the attenuation of the −dF/dT curve caused by the SG filter varies not only with the polynomial order and window size but also with the sharpness of the peak.

The true melting temperature of a nucleic acid is the temperature at which 50% of the paired nucleic acid exists in the unbound state due to the temperature increase (T_(m) definition 1). It is important to note that the temperature at which the DNA unbinds at the maximal rate, defined by the peak of the derivative curve (T_(m) definition 2), is not the same as the temperature where 50% of the DNA is unbound. The method according to some aspects of the present invention can give the melting temperature as defined by either definition, as shown in FIG. 2. The method according to some aspects of the present invention does not require the derivation of −dF/dT to obtain T_(m) according to definition 1. However, it can derive the derivative curve if necessary, and FIG. 11 shows that this yields a curve very close to the true derivative, unlike with the SG filters.

Perhaps the greatest shortcoming of the SG derivative filter is its inability to resolve and detect multiple melting temperatures for heterozygous mutant DNA when there are two or more T_(m)s that are in close proximity. The method according to this aspect of the present invention has a much greater chance of overcoming this deficiency, as illustrated in FIG. 4. FIG. 4 shows that the method for determining melting temperatures according to one embodiment of the present invention is more successful at detecting the presence of two nucleic acids where their individual T_(m)s are close to each other (such as, for example, within 1.5° C.). The total derivative which the Savitzky-Golay filter estimates cannot detect the presence of two nucleotides when their melting temperatures are close.

In accordance with other aspects, the present invention provides methods and systems that utilize an algorithm that has the ability to detect one or more melting temperature(s) from nucleic acid thermal melt data as well as other thermodynamic parameters such as the enthalpy and amplitude (signal level) for each melting temperature. In one aspect, the methods and systems are based on fitting thermal melt data to a mathematical model through non-linear least squares fitting of parameters. But this algorithm does more than fit data to equations. It has the ability to adaptively expand the equation set to detect multiple features due to multiple mutations, but only if necessary. With this method, the results yield more accurate information, and more melting temperatures that are more robust, than are obtained with the SG method. Furthermore, the present methods produce a −dF/dT curve that is closer to the true curve, if such a curve is desired. In accordance with this aspect of the invention, the model assumes that the change in heat capacity is zero.

Each feature has a unique melting temperature (T_(m_i)), enthalpy (ΔH_(i)) and fluorescence amplitude (A_(i)). All features share a common decay constant (k) and fluorescence offset (B). These parameters enter the model through the following equations.

$\begin{matrix}
{\text{Total Fluorescence Equation:}\quad F_{sum} = B + \sum\limits_{i = 1}^{N_{pk}} F_{i}} & {\text{Equation 1}} \\
{\text{Fluorescence of Individual Features:}\quad F_{i} = 2 A_{i}\, g_{i}\, e^{- k{({T - T_{m_{i}}})}}} & {\text{Equation 2}} \\
{\text{Binding Fraction of Individual Features:}\quad g_{i} = 1 + \frac{1}{4 e^{\frac{\Delta H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}} - \sqrt{\left( 1 + \frac{1}{4 e^{\frac{\Delta H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}} \right)^{2} - 1}} & {\text{Equation 3}}
\end{matrix}$
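By way of non-limiting illustration, Equations 1 through 3 can be evaluated as in the following Python sketch. The function names, the use of kelvin for T and T_(m_i), the value and units of the gas constant R, the negative sign of ΔH (so that the binding fraction falls as temperature rises) and the numeric values are assumptions made for the illustration and are not taken from the specification. The exponentials are read as base e, which gives g_(i) = 0.5 and F_(i) = A_(i) at T = T_(m_i).

```python
import math

R_GAS = 8.314  # J/(mol*K); assumed units for the gas constant R


def binding_fraction(T, Tm, dH):
    """Equation 3: binding fraction g_i at absolute temperature T (kelvin)."""
    u = 1.0 / (4.0 * math.exp((dH / R_GAS) * (1.0 / Tm - 1.0 / T)))
    return 1.0 + u - math.sqrt((1.0 + u) ** 2 - 1.0)


def feature_fluorescence(T, Tm, dH, A, k):
    """Equation 2: fluorescence contributed by a single feature."""
    return 2.0 * A * binding_fraction(T, Tm, dH) * math.exp(-k * (T - Tm))


def total_fluorescence(T, features, B, k):
    """Equation 1: offset B plus the sum over features [(Tm, dH, A), ...]."""
    return B + sum(feature_fluorescence(T, Tm, dH, A, k)
                   for Tm, dH, A in features)


# Hypothetical single feature: Tm = 350 K, dH = -300 kJ/mol, A = 100.
features = [(350.0, -3.0e5, 100.0)]
print(binding_fraction(350.0, 350.0, -3.0e5))                 # 0.5 at T = Tm
print(total_fluorescence(350.0, features, B=10.0, k=0.01))    # B + A at T = Tm
```

At T = T_(m_i) the sketch returns B + A, which agrees with the interpretation of A_(i) as the fluorescence level above the offset B at the melting temperature.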

In the detection of one feature, the vector of parameter estimates is:

$P = \begin{bmatrix}T_{m} \\{\Delta \; H} \\A \\B \\k\end{bmatrix}$

In the detection of two features the vector of parameter estimates is:

$P = \begin{bmatrix}T_{m1} \\{\Delta \; H_{1}} \\A_{1} \\\begin{matrix}T_{m2} \\{\Delta \; H_{2}} \\A_{2}\end{matrix} \\\begin{matrix}B \\k\end{matrix}\end{bmatrix}$

In general, the number of parameters in the detection of N_(pk) features is equal to 3(N_(pk))+2. For any number of features, the fluorescence melt data is fit to the mathematical model, such as described herein. In accordance with this aspect of the invention, this mathematical model assumes that the change in heat capacity is zero, and thus the change in heat capacity does not need to be determined.
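As a minimal sketch of the parameter layout described above, the following Python helpers pack and unpack a parameter vector ordered as [T_(m1), ΔH₁, A₁, . . . , B, k]. The helper names are hypothetical and the numeric values are illustrative only.

```python
def pack_parameters(features, B, k):
    """Build P = [Tm1, dH1, A1, ..., TmN, dHN, AN, B, k] from per-feature triples."""
    P = []
    for Tm, dH, A in features:
        P.extend([Tm, dH, A])
    return P + [B, k]


def unpack_parameters(P):
    """Split P back into (features, B, k); len(P) must equal 3*N_pk + 2."""
    if len(P) < 5 or (len(P) - 2) % 3 != 0:
        raise ValueError("length of P must be 3*N_pk + 2")
    n_pk = (len(P) - 2) // 3
    features = [tuple(P[3 * i:3 * i + 3]) for i in range(n_pk)]
    return features, P[-2], P[-1]


# Hypothetical two-feature example: 3*2 + 2 = 8 parameters.
P = pack_parameters([(350.0, -3.0e5, 100.0), (352.0, -2.8e5, 80.0)], 10.0, 0.01)
print(len(P))
```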

In one aspect, the present invention utilizes non-linear least squares fitting to identify melting temperatures and other useful parameters. The general form of any non-linear equation (regardless of the number of parameters) can be written as a function of the independent variable and the constant parameters that one wishes to solve for:

y=ƒ(x, P₁, P₂, . . . , P_(m))

Thus the i^(th) dependent variable in terms of the i^(th) independent variable is of the form:

y_(i)=ƒ(x_(i), P₁, P₂, . . . , P_(m)).

Unconstrained nonlinear least-squares data fitting by the Gauss-Newton or Marquardt-Levenberg method can be applied as follows:

1. Start with an initial estimation for each of the unknown parameters:

$P = \begin{bmatrix}P_{1} \\ P_{2} \\ \vdots \\ P_{m}\end{bmatrix}.$

2. For each temperature reading (x_(i)) value, using the current parameters in P, evaluate the estimated fluorescence value (ŷ_(i)):

ŷ_(i)=ƒ(x_(i), P₁, P₂, . . . , P_(m)). In vector form, if there are n pairs,

$\hat{Y} = \begin{bmatrix}{\hat{y}}_{1} \\ {\hat{y}}_{2} \\ \vdots \\ {\hat{y}}_{n}\end{bmatrix}.$

3. Compute the residual vector R, which is the difference between the measured fluorescence and estimated fluorescence:

$R = \begin{bmatrix}r_{1} \\ r_{2} \\ \vdots \\ r_{n}\end{bmatrix} = \hat{F} - F = \begin{bmatrix}{\hat{f}}_{1} \\ {\hat{f}}_{2} \\ \vdots \\ {\hat{f}}_{n}\end{bmatrix} - \begin{bmatrix}f_{1} \\ f_{2} \\ \vdots \\ f_{n}\end{bmatrix}.$

4. Compute the Sum of the Squared Errors scalar value for the current estimates:

${SSE} = \sum\limits_{i = 1}^{n}\left( {\hat{f}}_{i} - f_{i} \right)^{2}.$

In matrix form, SSE=R^(T)R.

If the percent change in SSE from the preceding value is less than or equal to some small tolerance value, such as, for example, 0.01%, then stop here and use the current parameters in P as the parameters that yield minimum SSE. For example:

If $\frac{{SSE}_{\text{this iteration}} - {SSE}_{\text{last iteration}}}{{SSE}_{\text{this iteration}}} \leq 10^{- 4}$, then STOP.

5. Take an infinitesimal step, ∂P, away along the axis of each parameter in order to discretely calculate the local gradient or Jacobian:

$\partial P = \begin{bmatrix}{\partial P_{1}} \\ {\partial P_{2}} \\ \vdots \\ {\partial P_{m}}\end{bmatrix} = \gamma \cdot P = \gamma \cdot \begin{bmatrix}P_{1} \\ P_{2} \\ \vdots \\ P_{m}\end{bmatrix} = \begin{bmatrix}{\gamma P_{1}} \\ {\gamma P_{2}} \\ \vdots \\ {\gamma P_{m}}\end{bmatrix},$

where γ is a very small scalar constant (on the order of approximately 10⁻⁸).

6. Compute the Jacobian (partial derivative) matrix of the function ƒ with respect to each parameter at the current estimate of the function, ŷ, for each x value:

$J = \begin{bmatrix}\frac{\partial{\hat{y}}_{1}}{\partial P_{1}} & \frac{\partial{\hat{y}}_{1}}{\partial P_{2}} & \ldots & \frac{\partial{\hat{y}}_{1}}{\partial P_{m}} \\ \frac{\partial{\hat{y}}_{2}}{\partial P_{1}} & \frac{\partial{\hat{y}}_{2}}{\partial P_{2}} & \ldots & \frac{\partial{\hat{y}}_{2}}{\partial P_{m}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial{\hat{y}}_{n}}{\partial P_{1}} & \frac{\partial{\hat{y}}_{n}}{\partial P_{2}} & \ldots & \frac{\partial{\hat{y}}_{n}}{\partial P_{m}}\end{bmatrix},$

where

$\frac{\partial{\hat{y}}_{i}}{\partial P_{1}}$ is equated discretely as $\frac{f\left( x_{i}, P_{1} + \partial P_{1}, P_{2}, \ldots, P_{m} \right) - f\left( x_{i}, P_{1}, P_{2}, \ldots, P_{m} \right)}{\partial P_{1}}$,

$\frac{\partial{\hat{y}}_{i}}{\partial P_{2}}$ is equated discretely as $\frac{f\left( x_{i}, P_{1}, P_{2} + \partial P_{2}, \ldots, P_{m} \right) - f\left( x_{i}, P_{1}, P_{2}, \ldots, P_{m} \right)}{\partial P_{2}}$, . . . , and

$\frac{\partial{\hat{y}}_{i}}{\partial P_{m}}$ is equated discretely as $\frac{f\left( x_{i}, P_{1}, P_{2}, \ldots, P_{m} + \partial P_{m} \right) - f\left( x_{i}, P_{1}, P_{2}, \ldots, P_{m} \right)}{\partial P_{m}}$.

7. Compute ΔP, which is a step vector that is to be added to the current P vector in the next iteration, as follows:

ΔP=pinv(J)*R=(J^(T)J)⁻¹J^(T)R.

8. Generate a new vector of parameters:

P=P+ΔP.

9. Repeat steps 2 through 8 until the percent change in SSE is less than or equal to the tolerance value in step 4, or until a maximum set number of iterations has been performed, such as, for example, 100. In one embodiment, each time this loop is run, the number of iterations increments by 1. If the maximum number of iterations has been performed and the percent change in SSE is greater than the tolerance value, then the parameters did not converge to a local minimum.
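By way of non-limiting illustration, steps 1 through 9 can be sketched in Python as follows. The sketch makes several assumptions that are not part of the specification: the residual is taken as measured minus estimated, which is the sign for which the update P+ΔP of steps 7 and 8 reduces SSE; the absolute value of the relative SSE change is compared with the tolerance; the normal equations are solved by a small Gaussian elimination routine instead of a pseudo-inverse; and a simple exponential decay with synthetic data stands in for the melt model. All function names are hypothetical.

```python
import math


def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= factor * aug[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def gauss_newton(f, xs, ys, P, gamma=1e-8, tol=1e-4, max_iter=100):
    """Return (P, SSE, converged) for the model f(x, P) fitted to (xs, ys)."""
    P = list(P)                                            # step 1
    m = len(P)
    prev_sse = None
    for _ in range(max_iter):                              # step 9
        y_hat = [f(x, P) for x in xs]                      # step 2
        R = [y - yh for y, yh in zip(ys, y_hat)]           # step 3 (measured - estimated)
        sse = sum(r * r for r in R)                        # step 4
        if prev_sse is not None and (
                sse == 0.0 or abs(sse - prev_sse) / sse <= tol):
            return P, sse, True
        prev_sse = sse
        J = []                                             # steps 5 and 6
        for x, yh in zip(xs, y_hat):
            row = []
            for j in range(m):
                dP = gamma * P[j]                          # assumes P[j] != 0
                shifted = P[:j] + [P[j] + dP] + P[j + 1:]
                row.append((f(x, shifted) - yh) / dP)
            J.append(row)
        JtJ = [[sum(row[a] * row[b] for row in J) for b in range(m)]
               for a in range(m)]
        JtR = [sum(row[a] * r for row, r in zip(J, R)) for a in range(m)]
        step = solve_linear(JtJ, JtR)                      # step 7
        P = [p + d for p, d in zip(P, step)]               # step 8
    sse = sum((y - f(x, P)) ** 2 for x, y in zip(xs, ys))
    return P, sse, False


def decay(x, P):
    """Stand-in model (not the melt model): P[0] * exp(-P[1] * x)."""
    return P[0] * math.exp(-P[1] * x)


# Synthetic data: true parameters (2.0, 0.5) plus a small deterministic ripple.
xs = [0.2 * i for i in range(26)]
ys = [2.0 * math.exp(-0.5 * x) + 0.005 * math.sin(7 * i)
      for i, x in enumerate(xs)]
P_fit, sse_fit, converged = gauss_newton(decay, xs, ys, [1.5, 0.4])
print(P_fit, sse_fit, converged)
```

Because the forward-difference step is γP_(j), a parameter whose current value is exactly zero would need a separate absolute step size.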

How the algorithm functions mathematically is explained in connection with the illustration in FIG. 5. FIG. 5 shows a visual representation of the iterative steepest descent algorithm with two unknown parameters, P₁ and P₂ (e.g., T_(m), ΔH, etc.). The procedure is the same with more than two unknowns but is difficult to visualize as more than three dimensions are needed. Note that the number of unknowns for N_(pk) peaks is 3*N_(pk)+2. An initial estimation of the parameters (P₁, P₂) is located somewhere on the shaded surface shown in FIG. 5. From the current position (iteration), a new vector, ΔP, is calculated whose magnitude and direction are influenced by the Jacobian matrix J and the residual vector R. The vector ΔP points in the direction of steepest descent (slope) on the surface and results in the identification of a new point P. This procedure iteratively repeats until the surface is almost flat (tolerance is reached). This means that the local minimum point is largely obtained and the iteration process can be stopped. Of course, this technique is not limited to fitting equations 1 through 3.

In accordance with some aspects, the method includes estimating quality of the results by using the Jacobian matrix to estimate the Standard Error (or deviation) of each of the parameter estimates, and using residuals and parameter estimates to obtain the signal to noise ratio (SNR). Once the least squares set of parameters defined in vector P is obtained, the vector consisting of standard error (or deviation) of each parameter can be estimated. The covariance matrix of parameter estimates can be calculated as:

$C = \frac{R^{T}R\,\left( {J^{T}J} \right)^{- 1}}{n - m}$

wherein R and J are the residual vector and Jacobian matrix, respectively, which are defined above, n is the number of fluorescence-temperature data points included in the fit and m is the number of parameters that are fit.

The standard deviation of the estimated parameters can be equated as the square root of the diagonal elements of C:

${STDP} = {\begin{bmatrix}\sqrt{C_{1,1}} \\\sqrt{C_{2,2}} \\\vdots \\\sqrt{C_{m,m}}\end{bmatrix}.}$

The amount or standard deviation of noise can be equated as the standard deviation of the residual vector, which is a scalar:

${stdr} = \sqrt{\frac{\sum\limits_{i = 1}^{n}\left( {r_{i} - \overset{\_}{r}} \right)^{2}}{n - 1}}.$

The signal magnitude is determined by the fit parameter A_(i) for each feature. A_(i) is the fluorescence level above the base fluorescence B at the melting temperature T_(m).

The signal to noise ratio for each feature is quantified as:

${SNR}_{i} = {\frac{A_{i}}{stdr}.}$
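The quality estimates above can be sketched in Python as follows, by way of illustration only. The function names are hypothetical, the matrix inverse is computed by a small Gauss-Jordan routine chosen for self-containment, and the Jacobian, residuals and amplitude in the usage lines are made-up values, not data from the specification.

```python
import math


def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def parameter_std(J, R):
    """STDP: square roots of the diagonal of C = R^T R (J^T J)^-1 / (n - m)."""
    n, m = len(J), len(J[0])
    JtJ = [[sum(row[a] * row[b] for row in J) for b in range(m)]
           for a in range(m)]
    scale = sum(r * r for r in R) / (n - m)
    C = [[scale * v for v in row] for row in invert(JtJ)]
    return [math.sqrt(C[j][j]) for j in range(m)]


def residual_std(R):
    """stdr: sample standard deviation of the residual vector."""
    mean = sum(R) / len(R)
    return math.sqrt(sum((r - mean) ** 2 for r in R) / (len(R) - 1))


def snr(A, R):
    """SNR_i = A_i / stdr for a feature with fitted amplitude A."""
    return A / residual_std(R)


# Made-up 3-point, 2-parameter example.
J = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
R = [0.1, -0.2, 0.1]
print(parameter_std(J, R), residual_std(R), snr(1.0, R))
```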

In accordance with aspects of the invention, the number of features (1, 2, 3, etc.) can be determined adaptively. In fitting the parameters to equations 1 through 3, it is desirable to know how many features or nucleotides there are, so that the data is fit to the correct equation as i goes from 1 to N_(pk).

FIG. 6 illustrates a flow chart for a process 600 for an adaptive procedure for determining model equation complexity, i.e., number of peaks or nucleotides, in accordance with an embodiment of the present invention. Process 600 may begin in step 601 in which all melt curves are initially assumed to consist of 1 peak. The data is then fit twice, once assuming a mathematical model with 1 peak, and once assuming a model consisting of 2 peaks. Thus, in step 602, the melt curve is fit to a mathematical model with N_(pk) peak(s) using a nonlinear least-squares procedure [m₁=3N_(pk)+2], while in step 612, the melt curve is fit to a mathematical model with N_(pk)+1 peaks using a nonlinear least-squares procedure [m₂=3(N_(pk)+1)+2]. In step 603, the Sum Squared Error (SSE₁) is calculated, the Residual Runs Test (RRT₁) is performed and the Standard Deviation of Fit Parameters (STDP₁) is calculated for the data from step 602, while in step 613, the Sum Squared Error (SSE₂) is calculated, the Residual Runs Test (RRT₂) is performed and the Standard Deviation of Fit Parameters (STDP₂) is calculated for the data from step 612.

In step 604, an F-test is performed comparing SSE₁ from step 603 and SSE₂ from step 613. In one embodiment, three queries of the data are made in step 605. First, is the F-ratio of SSE₂ relative to SSE₁ much greater than the predicted threshold? Second, does RRT₁ fail and RRT₂ pass? Third, is there a significant reduction in STDP₂ from STDP₁? If the answers to all of these queries are no, then the Model 1 parameter results with N_(pk) peaks are used. If the answer is yes to any or all of the queries, then in step 606, N_(pk) is incremented by one and the data is again fit to the two models by repeating the steps described above. In summary, if the fit results show a significant improvement with the added peak, then that more complex mathematical model is chosen over the simpler one.

In one embodiment, the threshold is determined by calculating the F-ratio of many melt curves of known nucleic acids, such as, for example, as illustrated in FIG. 11. It is known that the wild type and homozygous mutant nucleic acids have one feature and that the single heterozygous mutant has two features. Based on the F-ratio distributions of these nucleic acids, approximately 750 would be a good threshold in this example. The F-ratio distributions (and thus the threshold) may change for different assays and different experimental conditions, but can be pre-determined from a data set specific to those same conditions.

Because the model with the added peak is a superset of the simpler model, SSE₂ is always less than SSE₁. But it is desirable to know whether the improvement in SSE is worth the extra 3 parameters needed to describe the added feature. The F-test in step 604 is used to answer this question. The F-test for comparing a 1-peak model to a 2-peak model is given by the formula:

$F_{ratio} = \frac{\left( {{SSE}_{1} - {SSE}_{2}} \right)/\left( {m_{2} - m_{1}} \right)}{{SSE}_{2}/\left( {n - m_{2}} \right)}$

in which

m₁ is the number of fit parameters for the 1-peak model, which is 5, m₂ is the number of fit parameters for the 2-peak model, which is 8, and n is the number of data points included in the fit. If the simpler model is correct, an F ratio near 1.0 would be expected. If the ratio is much greater than 1.0, then it is highly likely that the more complicated model is correct.
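By way of non-limiting illustration, the F-ratio above and the adaptive procedure of FIG. 6 can be sketched in Python as follows. The callable `fit`, the scalar summary of STDP that it returns, the `stdp_drop` factor used to decide what counts as a significant reduction in STDP, the `max_peaks` cap and all numeric values are hypothetical choices made for the sketch; the specification does not quantify the STDP query or cap the number of peaks.

```python
def f_ratio(sse1, sse2, m1, m2, n):
    """F-ratio comparing a simpler fit (sse1, m1) with a richer fit (sse2, m2)."""
    return ((sse1 - sse2) / (m2 - m1)) / (sse2 / (n - m2))


def choose_num_peaks(fit, n, f_threshold, stdp_drop=0.5, max_peaks=4):
    """Return the selected N_pk.

    fit(n_pk) must return (sse, rrt_pass, stdp) for a model with n_pk peaks,
    where stdp is a scalar summary of the parameter standard deviations.
    """
    n_pk = 1                                              # step 601
    while n_pk < max_peaks:
        sse1, rrt1, stdp1 = fit(n_pk)                     # steps 602 and 603
        sse2, rrt2, stdp2 = fit(n_pk + 1)                 # steps 612 and 613
        m1, m2 = 3 * n_pk + 2, 3 * (n_pk + 1) + 2
        better_f = f_ratio(sse1, sse2, m1, m2, n) > f_threshold   # step 604
        better_rrt = (not rrt1) and rrt2
        better_stdp = stdp2 < stdp_drop * stdp1
        if not (better_f or better_rrt or better_stdp):   # step 605
            break
        n_pk += 1                                         # step 606
    return n_pk


# Made-up fit summaries for 1, 2 and 3 peaks: (SSE, runs test passed, STDP).
fake_fits = {1: (100.0, False, 1.0), 2: (1.0, True, 0.1), 3: (0.99, True, 0.1)}
print(choose_num_peaks(fake_fits.get, n=100, f_threshold=50.0))
print(f_ratio(10.0, 5.0, 5, 8, 108))
```

With these made-up summaries, adding the second peak raises the F-ratio far above the threshold while adding a third does not, so the sketch settles on two peaks.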

To illustrate the application of a model in accordance with one embodiment, melt curve data were generated for homozygous wild type salmonella DNA and for heterozygous mutant salmonella DNA. FIGS. 7A and 7B show the data and analysis for the homozygous wild type salmonella DNA and FIGS. 8A and 8B show the data and analysis for heterozygous mutant salmonella DNA. FIG. 7A shows that the homozygous wild type salmonella DNA is correctly fit to one peak. FIG. 7B shows that the homozygous wild type salmonella DNA is incorrectly over-fit to two peaks. Quantitatively, the F_(ratio) is 2.45 in this case. This number is not much bigger than 1 and the second peak is deemed unnecessary. Thus, a single peak or feature adequately describes the shape of the melting curve for the homozygous wild type salmonella DNA. FIG. 8A shows that the heterozygous mutant salmonella DNA is inadequately fit to one peak. FIG. 8B shows that the heterozygous mutant salmonella DNA is properly fit to two peaks. Quantitatively, the F_(ratio) is 106.41 in this case. This number is much bigger than 1 and the second peak is deemed necessary. Thus, a second peak or feature is needed to adequately describe the shape of the melting curve for the heterozygous mutant salmonella DNA. This process can be repeated to evaluate whether a 3^(rd) or 4^(th) feature is necessary as shown in the flowchart in FIG. 6.

The Residual Runs test can also be used to compare fit models to determine whether an added peak is necessary to describe a melt curve. The residual is the difference of the fit curve (thin line) points from the actual data (thick line). Statistically, the residual runs test can be used to test if a model fit curve adequately describes the data. A run is a consecutive series of points whose residuals are either all positive or all negative.

If the data points are randomly distributed above and below the fit fluorescence line, it is possible to calculate the expected number of runs. If there are Np points above the fit curve and Nn points below the curve, the total number of runs approximately follows a normal distribution with a mean expected value of [(2*Np*Nn)/(Np+Nn)]+1 and a standard deviation of sqrt((2*Np*Nn*(2*Np*Nn−Np−Nn))/(((Np+Nn)^2)*(Np+Nn−1))). If fewer runs are obtained than the expected value, then there may be some dynamic that is not being modeled, and a more complex model or added feature is required to describe the data. The p-value of the total number of runs in the distribution reflects the probability of getting as few or fewer runs as observed in this experiment for a randomly distributed set of residuals. Thus if the p value is low, it may be concluded that the model used to fit the data is inadequate.
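The residual runs test described above can be sketched in Python as follows, by way of illustration only. The function name is hypothetical, residuals equal to zero are skipped (an assumption, since the text does not address ties), and the p-value is obtained from the normal approximation using the error function.

```python
import math


def residual_runs_test(residuals):
    """Return (runs, expected, sd, p_value) for a one-sided 'too few runs' test."""
    signs = [r > 0 for r in residuals if r != 0]   # zero residuals skipped (assumption)
    n_p = sum(signs)                               # points above the fit curve
    n_n = len(signs) - n_p                         # points below the fit curve
    if n_p == 0 or n_n == 0:
        raise ValueError("residuals must include both positive and negative values")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    two = 2.0 * n_p * n_n
    expected = two / (n_p + n_n) + 1.0
    sd = math.sqrt(two * (two - n_p - n_n)
                   / ((n_p + n_n) ** 2 * (n_p + n_n - 1)))
    z = (runs - expected) / sd
    p_value = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # P(runs <= observed)
    return runs, expected, sd, p_value


# Made-up residuals: a long negative stream followed by a long positive stream.
print(residual_runs_test([-1.0] * 10 + [1.0] * 10))
```

A long stream of negative residuals followed by a long stream of positive residuals yields only two runs and a very small p-value, which would point toward an inadequate model.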

In the top subplot of FIG. 7A, the residual fluctuates randomly about 0, indicating that a second peak is unnecessary. In contrast, the top subplot of FIG. 8A shows a stream of negative residuals followed by a stream of positive residuals (resulting in far fewer runs than expected), indicating that a second peak is necessary. In the top subplot of FIG. 8B, the residuals again appear to be randomly distributed, resulting in a larger number of runs.

FIG. 9 illustrates a flow chart for process 900 for the analysis of denaturation data to identify a nucleic acid in a biological sample, in accordance with one embodiment. Process 900 may begin in step 901, in which the melt curve data are input into an appropriately programmed computer. The number of peaks or nucleotides is determined in step 902, e.g., in accordance with the flow chart illustrated in FIG. 6. Once the number of peaks or nucleotides has been determined, the thermodynamic parameters are obtained in step 903 from the fit to the model with N_(pk) peaks or nucleotides. The class conditional densities and priors of parameters for all possible nucleic acids, from a training set of melt curves of known nucleic acids, are input in step 904. In step 905, the fit thermodynamic parameters of the melt curve are compared to those of all possible nucleic acids. In step 906, the posterior probabilities that the nucleic acids of the input melt curve are each of the possible nucleic acids are calculated. In step 907, a query is made as to whether the largest posterior probability is greater than a preselected threshold, e.g., 98%. If the answer is no, the process is completed with no identification of the nucleic acid. If the answer is yes, step 908 is performed. In step 908, a query is made as to whether the fit parameters are contained within a reasonable percentage of the class conditional densities for the nucleic acid with the largest posterior probability. If the answer is no, the process is completed with no identification of the nucleic acid. If the answer is yes, step 909 outputs the nucleic acid of the melt curve as the one with the largest posterior probability.
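Steps 904 through 909 can be sketched as a small Bayesian classifier. The text does not specify the form of the class conditional densities, so this sketch assumes independent Gaussian densities for each fit parameter. It also interprets the "reasonable percentage" check of step 908 as a maximum number of standard deviations from the class mean. Both choices, along with the names `classify`, `threshold` and `max_z`, are assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def classify(fit_params, classes, threshold=0.98, max_z=3.0):
    """Return (identified_class_or_None, posterior_probabilities).

    fit_params: dict of fitted values, e.g. {"Tm": 350.02}
    classes:    dict name -> {"prior": p, "params": {key: (mean, sd)}}
                built from a training set of known nucleic acids
    """
    # Steps 905-906: prior times likelihood for every candidate class.
    joint = {}
    for name, info in classes.items():
        likelihood = 1.0
        for key, (mean, sd) in info["params"].items():
            likelihood *= gaussian_pdf(fit_params[key], mean, sd)
        joint[name] = info["prior"] * likelihood
    total = sum(joint.values())
    if total == 0.0:
        return None, {}
    posterior = {name: value / total for name, value in joint.items()}

    # Step 907: the best posterior must exceed the preselected threshold.
    best = max(posterior, key=posterior.get)
    if posterior[best] <= threshold:
        return None, posterior

    # Step 908 (assumed form): fit must lie inside the bulk of the density.
    for key, (mean, sd) in classes[best]["params"].items():
        if abs(fit_params[key] - mean) > max_z * sd:
            return None, posterior

    # Step 909: report the class with the largest posterior probability.
    return best, posterior
```

The second check matters because a posterior can be close to 1 even when the fit is far from every known class; it only says which class is least unlikely.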

In accordance with the present invention, the derivation of a −dF/dT curve is not necessary in the determination of melting temperatures and other parameters in the thermodynamic mathematical model. However, once the parameters of the mathematical model have been identified, the analytical −dF/dT of the melt curve can be obtained, as persons skilled in the art conventionally look at melt data this way. For the i^(th) feature or peak, the analytical derivative of Equation 3 substituted into Equation 2 is given by:

dFdT(i)=½*A(i)*((8*exp(dH(i)*(T−Tm(i))/R/Tm(i)/T)+1)^(½)−4*exp(dH(i)*(T−Tm(i))/R/Tm(i)/T)−1)*(−dH(i)+k*(8*exp(dH(i)*(T−Tm(i))/R/Tm(i)/T)+1)^(½)*R*T^2)/(8*exp(dH(i)*(T−Tm(i))/R/Tm(i)/T)+1)^(½)/R/T^2*exp((−T+Tm(i))*k−dH(i)*(T−Tm(i))/R/Tm(i)/T); where

dFdT(i) is $\frac{dF_{i}}{dT}$, the derivative of F_(i) with respect to temperature, Tm(i) is T_(mi) (in Kelvin), dH(i) is ΔH_(i), A(i) is A_(i), k is k, R is the universal gas constant, and T is the temperature shown on the x axis (in Kelvin).

The total derivative is given by:

$\frac{F_{sum}}{t} = {\sum\limits_{i = 1}^{N_{pk}}{\frac{F_{i}}{t}.}}$

It should be noted that this method of generating −dF/dT can be applied to any set of equations relating fluorescence and temperature and is not limited to Equations 1 through 3.

The advantages to obtaining the derivative of the melt curve this way include, for example, that (i) it is closer to the true derivative as there is no low pass filtering attenuating the signal (see FIG. 2), and (ii) derivatives of individual features show isolated multiple peaks when they are close together. The total derivative merges them together making them indistinguishable (see FIG. 4), which is a potential problem in using SG filters.
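The analytical derivative given above can be transcribed directly and checked against a finite difference of the model. The sketch below assumes that Equation 2 has the form F_i = 2*A_i*g_i*exp(−k*(T−T_mi)) with g_i from Equation 3 taken as a function of exp(ΔH_i/R*(1/T_mi − 1/T)), which is the reading consistent with the exp(...) terms in the derivative expression. The function names and the units (J/mol for ΔH, Kelvin for T) are assumptions.

```python
import math

R = 8.314462618  # universal gas constant in J/(mol*K)


def feature(T, A, Tm, dH, k):
    """Contribution F_i of one feature at temperature T (assumed Eq. 2-3)."""
    E = math.exp(dH * (T - Tm) / (R * Tm * T))
    g = 1.0 + 1.0 / (4.0 * E) - math.sqrt((1.0 + 1.0 / (4.0 * E)) ** 2 - 1.0)
    return 2.0 * A * g * math.exp(-k * (T - Tm))


def feature_derivative(T, A, Tm, dH, k):
    """Analytical dF_i/dT, transcribed term by term from the expression above."""
    E = math.exp(dH * (T - Tm) / (R * Tm * T))
    s = math.sqrt(8.0 * E + 1.0)
    return (0.5 * A * (s - 4.0 * E - 1.0) * (-dH + k * s * R * T ** 2)
            / (s * R * T ** 2) * math.exp(-k * (T - Tm)) / E)


def total_derivative(T, features):
    """Sum of per-feature derivatives; features is a list of (A, Tm, dH, k)."""
    return sum(feature_derivative(T, A, Tm, dH, k) for A, Tm, dH, k in features)
```

Plotting `feature_derivative` for each feature separately, alongside `total_derivative`, reproduces the point made in the text: closely spaced features remain distinct individually even when their sum shows a single merged peak.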

Thus, the invention provides methods and systems for the analysis of denaturation data in the detection of nucleic acid that contains one or more peaks or mutations. This aspect of the invention overcomes problems that arise with the use of SG filters in obtaining the negative derivative curve.

Thus, in accordance with these aspects, the present invention provides a method for identifying a nucleic acid in a sample including at least one unknown nucleic acid. In accordance with this aspect, the method comprises fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P(x, Q) to determine an intrinsic physical value Q and to obtain an estimated physical change function, wherein the intrinsic physical value Q is an intrinsic physical value associated with the nucleic acid in the sample, and the quantifiable physical change P is associated with the denaturation of a nucleic acid. In one embodiment, the fitting is performed without determining the change in heat capacity of the sample.

In one embodiment, the method further comprises identifying the nucleic acid in the biological sample by comparing the intrinsic physical value Q for at least one unknown nucleic acid to an intrinsic physical value Q for a known nucleic acid. In another embodiment, the method further comprises identifying the nucleic acid in the biological sample by comparing the intrinsic physical value Q for at least one unknown nucleic acid to an a priori distribution of intrinsic physical values Q for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In another embodiment, Q is one or more fitting parameters Y, Z and W and the denaturation data is fit to one or more of these fitting parameters to determine one or more intrinsic physical values Y, Z and W. Each of the intrinsic physical values Y, Z and W is associated with a nucleic acid. For example, in one embodiment, Y is T_(m), Z is van't Hoff enthalpy ΔH and W is amplitude A. In an additional embodiment, one or more of the intrinsic physical values Y, Z and W are determined and compared. In a further embodiment, two or more of the intrinsic physical values Y, Z and W are determined and compared. In still yet another embodiment, all of the intrinsic physical values Y, Z and W are determined and compared.

In one embodiment, the fitting step includes fitting the denaturation data to a function P(x, Q) using a computer-implemented non-linear least squares algorithm. In another embodiment, the non-linear least squares algorithm determines the value of at least one fitting parameter, and the intrinsic physical value Y is a fitting parameter whose value is determined by the non-linear least squares algorithm. In a further embodiment, the fitting step includes fitting the denaturation data to a function P₁(x, Q) to obtain a first estimated physical change function in which the function P₁(x, Q) describes the relationship of the quantifiable physical change of a sample containing one nucleic acid to the independent sample property x, and the method further comprises the steps of (i) fitting the denaturation data to a function P₂(x, Q₁, Q₂) to obtain a second estimated physical change function, in which the function P₂(x, Q₁, Q₂) describes the relationship of the quantifiable physical change of a sample containing 2 distinct nucleic acids to the independent sample property x of a sample containing 2 distinct nucleic acids, and (ii) quantifying the number of distinct nucleic acids in the sample by comparing the first and second estimated physical change functions with the denaturation data to determine if 1 or 2 distinct unknown nucleic acids are present in the sample. In another embodiment, the determining step includes determining a melting temperature T_(m(1)) for a first unknown nucleic acid and determining a melting temperature T_(m(2)) for a second unknown nucleic acid.

In one embodiment, the method further comprises the step of calculating the binding fraction g_(i) of each nucleotide in the nucleic acid from the fit parameters. In another embodiment, the method further comprises the steps of calculating a derivative of the estimated physical change function dP/dx and displaying a plot of said derivative. In a further embodiment, the independent sample property x is the sample temperature T and the intrinsic physical value Y is the melting temperature T_(m). In one embodiment, the fitting step is performed by fitting the denaturation data to an equation of the form:

$\begin{matrix}{{P\left( {T,T_{m}} \right)} = {B + {\sum\limits_{i = 1}^{n}P_{i}}},\ \text{wherein}} & \left( \text{Eq. 1} \right) \\{P_{i} = {2A_{i}g_{i}e^{\lbrack{- {k{({T - T_{m_{i}}})}}}\rbrack}},} & \left( \text{Eq. 2} \right) \\{g_{i} = {1 + \frac{1}{4e^{\lbrack{\frac{\Delta H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}} - \sqrt{\left( {1 + \frac{1}{4e^{\lbrack{\frac{\Delta H_{i}}{R}{({\frac{1}{T_{m_{i}}} - \frac{1}{T}})}}\rbrack}}} \right)^{2} - 1}},} & \left( \text{Eq. 3} \right)\end{matrix}$

and wherein

B is the baseline measurement of the quantifiable physical property in absence of the sample, A_(i) is the amplitude of the measurement of the quantifiable physical property of the ith nucleic acid present in the sample, ΔH_(i) is the van't Hoff enthalpy of denaturation of the ith nucleic acid present in the sample, T_(m(i)) is the melting temperature of the ith nucleic acid present in the sample, and R is the universal gas constant.
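As a short consistency check on Equations 2 and 3, the square root in Equation 3 can be simplified. This assumes the bracketed exponent in Equation 3 is the argument of an exponential, as the exp(...) terms of the analytical derivative given earlier indicate; the shorthand E_(i) is introduced here only for illustration:

$E_{i} = e^{\left\lbrack \frac{\Delta H_{i}}{R}\left( \frac{1}{T_{m_{i}}} - \frac{1}{T} \right) \right\rbrack},\qquad \left( 1 + \frac{1}{4E_{i}} \right)^{2} - 1 = \frac{8E_{i} + 1}{16E_{i}^{2}},\qquad g_{i} = 1 + \frac{1 - \sqrt{8E_{i} + 1}}{4E_{i}}.$

At T = T_(m(i)), E_(i) = 1, so g_(i) = 1/2 and P_(i) = A_(i). Under this reading, the amplitude A_(i) is the value of the ith term at its melting temperature, and the factor (8E_(i)+1)^(1/2) is the same square root that appears in the analytical derivative.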

In one embodiment, the biological sample further includes a fluorescent dye, and the quantifiable physical change is fluorescence intensity for measuring the denaturation of the molecules in the sample. Suitable fluorescent dyes for detecting denaturation of molecules are well known in the art and include those described herein, such as the double-strand specific dyes of the SYBR Green family of dyes and LCGreen Plus. Any suitable fluorescent dye can be used in the methods of the invention. In another embodiment, the quantifiable physical change is ultraviolet absorbance for measuring the denaturation of the molecules in the sample. In a further embodiment, the method further comprises the step of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data. In another embodiment, the unknown nucleic acid is a nucleic acid containing a single nucleotide polymorphism.

In another aspect, the invention provides methods and systems for identifying a nucleic acid in a sample including an unknown nucleic acid or for detecting a single nucleotide polymorphism in a nucleic acid in a sample including at least one unknown nucleic acid. Some nucleic acid assays require identification of a single nucleotide change where the difference in T_(m) between the wild type and mutant nucleic acid is small, such as, for example, less than 0.25° C. This level of temperature resolution is difficult, if not impossible, in standard 96 and 384 well plates. Decreasing the area of thermal analysis can improve the spatial temperature gradient, but there is still significant noise generated from the heating device used to linearly ramp the samples during a thermal melt. It would be of great benefit to define the curve by multiple variables to increase the confidence of identifying a certain nucleic acid when comparing patient thermal melt curves to control thermal melt curves that are differentiated by small differences, such as, for example, less than 0.25° C.

In accordance with this aspect, methods are disclosed that utilize more than one variable to describe a thermal melt curve and allow a higher confidence when comparing unknown patient results to a control data set. In one embodiment, fluorescence data generated from a thermal melt curve can be defined by one or more variables, which include a slope and intercept from the upper and lower baselines, a midpoint of the transition (T_(m)) and the width of the transition (van't Hoff enthalpy, ΔH). Defining a thermal melt curve by both a T_(m) and van't Hoff enthalpy would further refine the characteristics of the control data that is to be compared to data from unknown nucleic acid in a biological sample.

By defining a control sample by a T_(m) and van't Hoff enthalpy, ΔH, in accordance with one embodiment, greater confidence in accurately identifying a nucleic acid in a biological sample is obtained. The van't Hoff enthalpy is more sensitive to data quality and is more likely to highlight differences in data sets generated by control nucleic acids with small differences in T_(m)'s. Along with the T_(m), van't Hoff enthalpy ΔH and amplitude A, the non-nucleotide-specific parameter, k, derived from the melting curve can be used to further define the data and to identify the unknown nucleic acid in a biological sample.

Thus, in accordance with this aspect of the invention, methods are provided for identifying a nucleic acid in a sample including an unknown nucleic acid. According to this aspect, the method comprises fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P(x, Y, Z) to determine intrinsic physical values Y and Z for the unknown nucleic acid and to obtain an estimated physical change function, wherein intrinsic physical values Y and Z are distinct intrinsic physical values associated with the nucleic acid in the sample and wherein the quantifiable physical change P is associated with denaturation of a nucleic acid. The method further comprises identifying the nucleic acid in the biological sample by comparing the intrinsic physical values Y and Z for the unknown nucleic acid to a priori distributions of the intrinsic physical values Y and Z for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical.

In one embodiment, the sample further includes a fluorescent dye for measuring the denaturation of the molecules in the sample. Suitable fluorescent dyes for detecting denaturation of molecules are well known in the art and include those described herein, such as the double-strand specific dyes of the SYBR Green family of dyes and LCGreen Plus. Any suitable fluorescent dye can be used in the methods of the invention. In another embodiment, the quantifiable physical change is fluorescence intensity. In a further embodiment, quantifiable physical change P is ultraviolet absorbance.

In a further embodiment, x is the sample temperature T, Y is the melting temperature T_(m) and Z is the van't Hoff enthalpy ΔH. In one embodiment, the fitting step includes fitting the denaturation data to a function P(T, T_(m), ΔH, A) in which A is amplitude to determine one or more nucleic acid specific parameters for the unknown nucleic acid. In another embodiment, the identifying step includes comparing one or more nucleic acid specific parameters for the unknown nucleic acid to one or more nucleic acid specific parameters for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In another embodiment, the method further comprises the step of calculating a derivative function dP/dx of the estimated physical change function and displaying a plot of the derivative function. In a further embodiment, the method further comprises the step of generating denaturation data from a sample including an unknown nucleic acid. In one embodiment, the denaturation data is thermal melt data.

In another embodiment, the present invention also provides a method for detecting a single nucleotide polymorphism in a nucleic acid in a sample including at least one unknown nucleic acid. According to this aspect, the method comprises fitting denaturation data to a function P(x, Y, Z) to determine intrinsic physical values Y and Z for the unknown nucleic acid and to obtain an estimated physical change function, wherein the denaturation data includes measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, wherein intrinsic physical values Y and Z are distinct intrinsic physical values associated with the nucleic acid containing a single nucleotide polymorphism, and wherein the quantifiable physical change P is associated with denaturation of a nucleic acid. The method further comprises detecting the presence of a single nucleotide polymorphism by comparing the intrinsic physical values Y and Z for the unknown nucleic acid to the intrinsic physical values Y and Z for a known nucleic acid to determine if the unknown nucleic acid is a single nucleotide polymorphism of the known nucleic acid. In one embodiment, the comparison of the intrinsic physical values Y and Z for at least one unknown nucleic acid is made to an a priori distribution of the intrinsic physical values Y and Z for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical.

In one embodiment, the sample further includes a fluorescent dye for measuring the denaturation of the molecules in the sample. Suitable fluorescent dyes for detecting denaturation of molecules are well known in the art and include those described herein, such as the double-strand specific dyes of the SYBR Green family of dyes and LCGreen Plus. Any suitable fluorescent dye can be used in the methods of the invention. In another embodiment, the quantifiable physical change is fluorescence intensity. In a further embodiment, quantifiable physical change P is ultraviolet absorbance. In one embodiment, x is the sample temperature T, Y is the melting temperature T_(m) and Z is the van't Hoff enthalpy ΔH. In another embodiment, the fitting step includes fitting the denaturation data to a function P(T, T_(m), ΔH, A) in which A is the amplitude to determine one or more of the nucleic acid specific parameters for the unknown nucleic acid. In a further embodiment, the identifying step includes comparing the nucleic acid specific parameters for the unknown nucleic acid to the nucleic acid specific parameters for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In one embodiment, the method further comprises the step of calculating a derivative function dP/dx of the estimated physical change function and displaying a plot of the derivative function. In another embodiment, the method further comprises the step of generating denaturation data from a sample including an unknown nucleic acid. In one embodiment, the denaturation data is thermal melt data.

In accordance with other aspects, the present invention also provides a system for identifying a nucleic acid in a sample including at least one unknown nucleic acid. An example of a suitable system in accordance with some aspects of the invention is illustrated in connection with FIG. 10. As illustrated in FIG. 10, system 100 may include a microfluidic device 102. Microfluidic device 102 may include one or more microfluidic channels 104. In the examples shown, device 102 includes two microfluidic channels, channel 104 a and channel 104 b. Although only two channels are shown in the exemplary embodiment, it is contemplated that device 102 may have fewer than two or more than two channels. For example, in some embodiments, device 102 includes eight channels 104.

Device 102 may include two DNA processing zones, a DNA amplification zone 131 (a.k.a., PCR zone 131) and a DNA melting zone 132. A DNA sample traveling through the PCR zone 131 may undergo PCR, and a DNA sample passing through melt zone 132 may undergo high resolution thermal melting. As illustrated in FIG. 10, PCR zone 131 includes a first portion of channels 104 and melt zone 132 includes a second portion of channels 104, which is downstream from the first portion.

Device 102 may also include a sipper 108. Sipper 108 may be in the form of a hollow tube. Sipper 108 has a proximal end that is connected to an inlet 109, which couples the proximal end of sipper 108 to channels 104. Device 102 may also include a common reagent well 106 which is connected to inlet 109. Device 102 may also include a locus specific reagent well 105 for each channel 104. For example, in the embodiment shown, device 102 includes a locus specific reagent well 105 a, which is connected to channel 104 a, and may include a locus specific reagent well 105 b which is connected to channel 104 b. Device 102 may also include a waste well 110 for each channel 104.

The solution that is stored in the common reagent well 106 may contain dNTPs, polymerase enzymes, salts, buffers, surface-passivating reagents, one or more non-specific fluorescent DNA detecting molecules, a fluid marker and the like. The solution that is stored in a locus specific reagent well 105 may contain PCR primers, a sequence-specific fluorescent DNA probe or marker, salts, buffers, surface-passivating reagents and the like.

In order to introduce a sample solution into the channels 104, system 100 may include a well plate 196 that includes a plurality of wells 198, at least some of which contain a sample solution (e.g., a solution containing a DNA sample). In the embodiment shown, well plate 196 is connected to a positioning system 194 which is connected to a main controller 130.

Main controller 130 may be implemented, for example, using a PXI-8105 controller which is available from National Instruments Corporation of Austin, Tex. Positioning system 194 may include a positioner (e.g., the MX80 positioner available from Parker Hannifin Corporation of PA (“Parker”)) for positioning well plate 196, a stepping drive (e.g., the E-AC Microstepping Drive available from Parker) for driving the positioner, and a controller (e.g., the 6K4 controller available from Parker) for controlling the stepping drive.

To introduce a sample solution into the channels 104, the positioning system 194 is controlled to move well plate 196 such that the distal end of sipper 108 is submerged in the sample solution stored in one of the wells 198. FIG. 10 shows the distal end of sipper 108 being submerged within the sample solution stored in well 198 n.

In order to force the sample solution to move up the sipper and into the channels 104, a vacuum manifold 112 and pump 114 may be employed. The vacuum manifold 112 may be operably connected to a portion of device 102 and pump 114 may be operably connected to manifold 112. When pump 114 is activated, pump 114 creates a pressure differential (e.g., pump 114 may draw air out of a waste well 110), and this pressure differential causes the sample solution stored in well 198 n to flow up sipper 108 and through inlet channel 109 into channels 104. Additionally, this causes the reagents in wells 106 and 105 to flow into a channel. Accordingly, pump 114 functions to force a sample solution and real-time PCR reagents to flow through channels 104. As illustrated in FIG. 10, melt zone 132 is located downstream from PCR zone 131. Thus, a sample solution will flow first through the PCR zone and then through the melting zone.

Referring back to well plate 196, well plate 196 may include a buffer solution well 198 a. In one embodiment, buffer solution well 198 a holds a buffer solution 197. Buffer solution 197 may comprise a conventional PCR buffer, such as a conventional real-time (RT) PCR buffer. Conventional PCR buffers are available from a number of suppliers, including: Bio-Rad Laboratories, Inc., Applied Biosystems, Roche Diagnostics, and others.

In order to achieve PCR for a DNA sample flowing through the PCR zone 131, the temperature of the sample must be cycled, as is well known in the art. Accordingly, in some embodiments, system 100 includes a temperature control system 120. The temperature control system 120 may include a temperature sensor, a heater/cooler, and a temperature controller. In some embodiments, a temperature control system 120 is interfaced with main controller 130 so that main controller 130 can control the temperature of the samples flowing through the PCR zone and the melting zone. Main controller 130 may be connected to a display device for displaying a graphical user interface. Main controller 130 may also be connected to user input devices 134, which allow a user to input data and commands into main controller 130.

To monitor the PCR process and the melting process that occur in PCR zone 131 and melt zone 132, respectively, system 100 may include an imaging system 118. Imaging system 118 may include an excitation source, an image capturing device, a controller, and an image storage unit. Other aspects of a suitable system in accordance with some aspects of the invention are disclosed in U.S. patent application Ser. No. 11/770,869, incorporated herein by reference in its entirety.

The system 100 further includes an appropriately controllable computer in communication with the user input devices 134, display device 132 and the main controller 130. The computer receives information from, among many sources, the imaging system 118 and temperature control system 120 and enables the identification of a nucleic acid in a sample including an unknown nucleic acid in accordance with some aspects of the invention.

According to this aspect, the system for identifying a nucleic acid in a sample including at least one unknown nucleic acid comprises a fitting module capable of fitting denaturation data received from, among other sources, the imaging system and temperature control system, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P(x, Q) to determine intrinsic physical value Q for the unknown nucleic acid, wherein Q is a distinct intrinsic physical value associated with the nucleic acid in the sample, and wherein the quantifiable physical change P is associated with denaturation of a nucleic acid. In accordance with one embodiment, the fitting module comprises an appropriately programmed computer or software stored on a computer readable medium (e.g., a non-volatile storage device or other storage device), where the software is configured such that when executed by a computer, the software enables the computer to fit denaturation data to a function P(x, Q) to determine intrinsic physical value Q.

The system in accordance with some aspects of the invention further comprises an identification module capable of identifying the nucleic acid in the biological sample by comparing the intrinsic physical value Q for the unknown nucleic acid to the intrinsic physical value Q for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In accordance with one embodiment, the identification module comprises an appropriately programmed computer or software stored on a computer readable medium, where the software is configured such that when executed by a computer, the software enables the computer to compare the intrinsic physical value Q for the unknown nucleic acid to the intrinsic physical value Q for a known nucleic acid to determine if the unknown nucleic acid is a single nucleotide polymorphism of the known nucleic acid.

In one embodiment, Q is one or more fitting parameters Y, Z and W and the denaturation data is fit to one or more of these fitting parameters to determine one or more intrinsic physical values Y, Z and W. In another embodiment, one or more of the intrinsic physical values Y, Z and W are determined and compared. In an additional embodiment, two or more of the intrinsic physical values Y, Z and W are determined and compared. In a further embodiment, all of the intrinsic physical values Y, Z and W are determined and compared. In one embodiment, x is the sample temperature T, Y is the melting temperature T_(m), Z is the van't Hoff enthalpy ΔH and W is the amplitude A. In some embodiments, Y and Z are determined and compared in the method according to the invention.

In one embodiment, the biological sample further includes a fluorescent dye, and the quantifiable physical change is fluorescence intensity for measuring the denaturation of the molecules in the sample. Suitable fluorescent dyes for detecting denaturation of molecules are well known in the art and include those described herein, such as the double-strand specific dyes of the SYBR Green family of dyes and LCGreen Plus. Any suitable fluorescent dye can be used in the methods of the invention. In a further embodiment, the quantifiable physical change is fluorescence intensity.

In one embodiment, the fitting module is further capable of fitting the denaturation data to a function P(T, T_(m), ΔH, A) in which A is amplitude to determine one or more nucleic acid specific parameters for the unknown nucleic acid. In another embodiment, the identification module is further capable of comparing the nucleic acid specific parameters for the unknown nucleic acid to the nucleic acid specific parameters for a known nucleic acid to determine if the unknown nucleic acid and the known nucleic acid are identical. In a further embodiment, the system further comprises a single-nucleotide-polymorphism detection module capable of comparing intrinsic physical values Y and Z for the unknown nucleic acid to a priori distributions of the intrinsic physical values Y and Z for a known nucleic acid to determine if the unknown nucleic acid is a single nucleotide polymorphism of the known nucleic acid. In one embodiment, the system further comprises a generating unit capable of generating denaturation data from a sample. In one embodiment, the generating unit comprises an appropriately programmed computer or software stored on a computer readable medium, where the software is configured such that when executed by a computer, the software enables the computer to generate denaturation data including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x for a sample. In one embodiment, the denaturation data is thermal melt data.

In one embodiment, the fitting module is a computer containing instructions for performing a non-linear least squares algorithm. In another embodiment, the independent sample property x is the sample temperature T. In a further embodiment, the intrinsic physical value Y is the melting temperature T_(m). In one embodiment, the computer containing instructions for performing a non-linear least squares algorithm further contains instructions for fitting the denaturation data to a function as described herein.

In one embodiment, the fitting module includes a computer or software stored on a computer readable medium containing instructions for performing a non-linear least squares algorithm to determine the value of at least one fitting parameter, in which the melting temperature T_(m) is a fitting parameter whose value is capable of being determined by the non-linear least squares algorithm. In another embodiment, the system further comprises a generating unit capable of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data. The generating unit is as described herein. In another embodiment, the single-nucleotide-polymorphism detection module is further capable of comparing nucleic acid specific parameters for the unknown nucleic acid to the nucleic acid specific parameters for a known nucleic acid to determine if the unknown nucleic acid is a single nucleotide polymorphism of the known nucleic acid.

In other aspects, the present invention further provides a method for quantifying the number of distinct nucleic acids in a sample, wherein the sample includes at least one nucleic acid. This method comprises fitting denaturation data, including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x, to a function P_(n)(x) to obtain a first estimated physical change function, wherein the function P_(n)(x) describes the relationship of the quantifiable physical change of a sample containing n distinct nucleic acids to an independent sample property x of a sample containing n distinct nucleic acids, and the quantifiable physical change P is associated with the denaturation of a nucleic acid. The method further comprises fitting the denaturation data to a function P_(n+1)(x) to obtain a second estimated physical change function, wherein said function P_(n+1)(x) describes the relationship of the quantifiable physical change of a sample containing n+1 distinct nucleic acids to an independent sample property x of a sample containing n+1 distinct nucleic acids. The method also comprises quantifying the number of distinct nucleic acids in the sample by comparing the first and second estimated physical change functions with the denaturation data to determine if n or n+1 different nucleic acids are present in the sample.

In one embodiment, the first fitting step includes determining the intrinsic physical value Q₁ for at least one of the nucleic acids present in the sample, and wherein said second fitting step includes determining an intrinsic physical value Q₂ for at least one of the nucleic acids in the sample. In another embodiment, the first fitting step includes fitting the denaturation data to a function P_(n)(x) using a computer-implemented fitting algorithm, and the second fitting step includes fitting the denaturation data to a function P_(n+1)(x) using a computer-implemented fitting algorithm. In a further embodiment, the independent sample property is the sample temperature T. In one embodiment, the computer-implemented fitting algorithms are non-linear least squares algorithms. In another embodiment, the non-linear least squares algorithm determines the value of at least one fitting parameter, in which the melting temperature T_(m) of at least one nucleic acid is a fitting parameter whose value is determined by the non-linear least squares algorithm. In a further embodiment, the method further comprises the step of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data.
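
By way of non-limiting illustration only, a non-linear least squares fit in which the melting temperature T_(m) is one of the fitting parameters may be sketched as follows. The sketch assumes a Levenberg-Marquardt iteration, which is one common non-linear least squares algorithm and is not stated here to be the algorithm used, and it assumes a hypothetical single-term sigmoid model with parameters (amplitude, T_(m), width) standing in for the function P_(n)(T). All names are illustrative.

```python
import math

def melt_model(temp, amp, tm, width):
    """Hypothetical single-duplex melt curve: amp * fraction still double-stranded."""
    z = max(-700.0, min(700.0, (temp - tm) / width))  # clamp keeps exp() finite
    return amp / (1.0 + math.exp(z))

def jacobian_row(temp, amp, tm, width):
    """Partial derivatives of melt_model with respect to (amp, tm, width)."""
    s = melt_model(temp, 1.0, tm, width)
    slope = amp * s * (1.0 - s) / width
    return [s, slope, slope * (temp - tm) / width]

def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            return None  # singular system
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def residual_sum(temps, signal, p):
    """Sum of squared differences between the data and the model at parameters p."""
    return sum((y - melt_model(t, *p)) ** 2 for t, y in zip(temps, signal))

def fit_melting_temperature(temps, signal, iterations=60):
    """Levenberg-Marquardt fit; returns the fitted [amp, tm, width]."""
    amp0 = max(signal)
    # Start Tm at the sample whose signal is closest to half of the maximum.
    tm0 = min(zip(temps, signal), key=lambda ts: abs(ts[1] - amp0 / 2.0))[0]
    p, lam = [amp0, tm0, 1.0], 1e-3
    current = residual_sum(temps, signal, p)
    for _ in range(iterations):
        rows = [jacobian_row(t, *p) for t in temps]
        resid = [y - melt_model(t, *p) for t, y in zip(temps, signal)]
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        jtr = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(3)]
        # Damping scales the diagonal, blending Gauss-Newton with gradient descent.
        damped = [[jtj[i][j] + (lam * jtj[i][i] if i == j else 0.0)
                   for j in range(3)] for i in range(3)]
        step = solve_linear(damped, jtr)
        trial = [pi + di for pi, di in zip(p, step)] if step else None
        if trial and trial[2] > 0.0 and residual_sum(temps, signal, trial) < current:
            p, current, lam = trial, residual_sum(temps, signal, trial), lam / 10.0
        else:
            lam *= 10.0
            if lam > 1e12:
                break  # no downhill step found: treat as converged
    return p

# Synthetic, noise-free thermal melt data used only to exercise the sketch.
melt_temps = [60.0 + 0.5 * i for i in range(61)]
melt_signal = [melt_model(t, 0.9, 74.3, 1.5) for t in melt_temps]
fitted_amp, fitted_tm, fitted_width = fit_melting_temperature(melt_temps, melt_signal)
```

In this sketch the fitted T_(m) plays the role of the intrinsic physical value that would then be compared against the corresponding value for a known nucleic acid.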

The present invention also provides a system for quantifying the number of distinct nucleic acids in a sample, said sample including at least one nucleic acid. This system comprises a fitting module capable of fitting denaturation data including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x to a function P_(n)(x) to obtain an n nucleic acid estimated physical change function, wherein said function P_(n)(x) describes the relationship of the quantifiable physical change of a sample containing n distinct nucleic acids to the independent sample property of a sample containing n distinct nucleic acids, and the quantifiable physical change is associated with the denaturation of a nucleic acid. The system further comprises a quantification module capable of quantifying the number of distinct nucleic acids in the sample by comparing an n nucleic acid physical change function and an n+1 nucleic acid physical change function with the denaturation data to determine if n or n+1 different nucleic acids are present in the sample.

In accordance with one embodiment of this aspect of the invention, the fitting module comprises an appropriately programmed computer or software stored on a computer readable medium, where the software is configured such that when executed by a computer, the software enables the computer to fit denaturation data including measurements of a quantifiable physical change P of the sample at a plurality of independent sample property points x to a function P_(n)(x) to obtain an n nucleic acid estimated physical change function, wherein said function P_(n)(x) describes the relationship of the quantifiable physical change of a sample containing n distinct nucleic acids to the independent sample property of a sample containing n distinct nucleic acids, and the quantifiable physical change is associated with the denaturation of a nucleic acid. In another embodiment, the quantification module is an appropriately programmed computer or software stored on a computer readable medium, where the software is configured such that when executed by a computer, the software enables the computer to determine if n or n+1 different nucleic acids are present in the sample.

In one embodiment, the independent sample property x is the sample temperature T. In another embodiment, the fitting module is capable of estimating the melting temperature T_(m) for at least one of the nucleic acids present in the sample, in which the second fitting step includes estimating the melting temperature T_(m) for at least one of the nucleic acids in the sample. In a further embodiment, the fitting module is a computer containing instructions for fitting the denaturation data to a function P_(n)(T) via a non-linear least squares algorithm. In one embodiment, the non-linear least squares algorithm is capable of determining the value of at least one fitting parameter, in which the melting temperature T_(m) of at least one nucleic acid is a fitting parameter whose value is capable of being determined by the non-linear least squares algorithm. In another embodiment, the system further comprises a generating unit capable of generating denaturation data from the sample. In one embodiment, the denaturation data is thermal melt data. The generating unit is as described herein.

The use of the terms “a” and “an” and “the” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. For example, if the range 10-15 is disclosed, then 11, 12, 13, and 14 are also disclosed. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

It will be appreciated that the methods and compositions of the instant invention can be incorporated in the form of a variety of embodiments, only a few of which are disclosed herein. Variations of those embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.

1. (canceled)
 2. A method for identifying one or more nucleic acids in at least one sample, the method comprising: providing a microfluidic device comprising at least one microfluidic channel; filling at least one of the microfluidic channels with at least one sample; amplifying the at least one sample in at least one of the microfluidic channels; increasing the temperature of the at least one sample while illuminating the sample and obtaining images of the sample depicting a signal emanating from the sample during the temperature increase; processing the images to obtain thermal dissociation data for the one or more nucleic acids; providing a mathematical representation to model the thermal dissociation data, the mathematical representation being a sum including one or more terms, wherein each of the one or more terms is associated with one of the nucleic acids in the sample, wherein each of the one or more terms depends on the temperature of the sample and parameters defining the one of the nucleic acids in the sample; fitting the measured thermal dissociation data to the mathematical representation to calculate the parameters defining each of the one or more nucleic acids in the sample; and identifying each of the one or more nucleic acids in the sample based on the calculated parameters.
 3. The method of claim 2, further comprising: determining a number of peaks in the thermal dissociation data presented as a thermal dissociation curve; and modeling the thermal dissociation data based on the determined number of peaks, wherein a number of the terms representing nucleic acids in the mathematical representation equals the number of peaks in the thermal dissociation curve.
 4. The method of claim 2, wherein the step of identifying each of the one or more nucleic acids comprises comparing the calculated parameters for each nucleic acid in the sample to parameters defining known nucleic acids.
 5. The method of claim 2, further comprising modelling a homozygous wild type by one term in the mathematical representation.
 6. The method of claim 2, further comprising modelling a heterozygous mutant by two terms in the mathematical representation.
 7. The method of claim 2, wherein the mathematical representation includes a baseline signal measured in absence of the nucleic acids.
 8. The method of claim 2, wherein the parameters defining the one or more nucleic acids are used as inputs to a classifier trained to predict a genotype of the sample.
 9. The method of claim 2, wherein the sample further includes a fluorescent dye, and wherein the signal emanating from the sample and indicative of a nucleic acid denaturation is selected from fluorescence intensity and ultraviolet absorbance of the dye.
 10. A system for identifying one or more nucleic acids in a sample, the system comprising: a microfluidic device comprising at least one microfluidic channel; a heating unit to increase and decrease temperature of the at least one microfluidic channel; a measuring unit to obtain thermal melt data for the one or more nucleic acids by measuring a signal emanating from the sample as the temperature of the sample is ramped, the signal being indicative of a nucleic acid denaturation; a modelling unit to provide a mathematical representation for the thermal melt data, the mathematical model being a sum including one or more terms, wherein each of the one or more terms is associated with one of the nucleic acids in the sample and depends on the temperature of the sample and parameters defining the one of the nucleic acids in the sample; a fitting unit to fit the measured melting data to the mathematical representation to calculate the parameters defining the one or more nucleic acids in the sample; and an identification module to identify each of the one or more nucleic acids based on the calculated parameters.